Beyond simple tests of value: measuring addiction as a heterogeneous disease of computation-specific valuation processes.

Brian M Sweis^1,2, Mark J Thomas^2,3, A David Redish^2.

Abstract

Addiction is considered to be a neurobiological disorder of learning and memory because addiction is capable of producing lasting changes in the brain. Recovering addicts chronically struggle with making poor decisions that ultimately lead to relapse, suggesting a view of addiction also as a neurobiological disorder of decision-making information processing. How the brain makes decisions depends on how decision-making processes access information stored as memories in the brain. Advancements in circuit-dissection tools and recent theories in neuroeconomics suggest that neurally dissociable valuation processes access distinct memories differently, and thus are uniquely susceptible as the brain changes during addiction. If addiction is to be considered a neurobiological disorder of memory, and thus decision-making, the heterogeneity with which information is both stored and processed must be taken into account in addiction studies. Addiction etiology can vary widely from person to person. We propose that addiction is not a single disease, nor simply a disorder of learning and memory, but rather a collection of symptoms of heterogeneous neurobiological diseases of distinct circuit-computation-specific decision-making processes.

Year: 2018 PMID: 30115772 PMCID: PMC6097760 DOI: 10.1101/lm.047795.118

Source DB: PubMed Journal: Learn Mem ISSN: 1072-0502 Impact factor: 2.699

Addiction has largely been considered to be a disorder of learning and memory. However, the information stored as memories in the brain is diverse, especially when considering the various circuits or systems within which those memories are stored. Furthermore, each memory system can play unique roles in various computations that are used during distinct aspects of decision-making information processing. Thus, by considering not only how heterogeneous memory processes might go awry in addiction but also how they might give rise to separable computation-specific decision-making vulnerabilities, we can refine our understanding of and therapies for distinct addiction etiologies. In this article, we will review literature discussing the intimate link between multiple memory systems and multiple decision-making systems within the brain and how recent advancements in circuit-dissection tools make interrogating these related processes more tractable. Specifically, we discuss the benefit of experimental approaches that measure and intentionally manipulate the cellular mechanisms of plasticity directly; these approaches have made strides in defining addiction as a neurobiological disorder of learning and memory. However, we emphasize that such approaches toward understanding the functional consequences of cellular and synaptic remodeling on behavior depend on tasks that discriminate distinct aspects of the on-going processing of that stored information. We argue that this framework can reveal more about the decision-making processes that may go awry in addiction. To this end, we highlight the utility of moving beyond simple tests of value and adopting neuroeconomic theories to drive the design of complex behavioral paradigms. By doing so, we aim to expand the perspective of addiction as not only a storage disorder of learning and memory, but also an access disorder of decision-making information processing.
With these two experimental approaches in mind—circuit-specific interrogation of plasticity directly and the use of complex neuroeconomic decision-making tasks that can dissociate distinct on-going valuation processes—we revisit the possible origins of addiction pathology and the heterogeneity with which addiction-related circuit remodeling could take place. We argue that combining these two approaches can reveal a much deeper understanding of fundamentally distinct addiction etiologies and can aid in tailoring effective treatment toward a given etiology's circuit-specific computational dysfunction. To illustrate this, we highlight recent discoveries that demonstrate the utility of this combined approach. Lastly, we provide suggestions as well as cautions moving forward for future experimental work of interest to those studying addiction.

From memory to decision-making

Any physical change in the brain that results from an experience can be considered to be a memory because such changes provide information about the past. Thus, addiction has been proposed to be a neurobiological disorder of learning and memory because drugs of abuse can leave lasting changes on the structure and function of the brain (Hyman 2005; Le Moal and Koob 2007; Volkow and Morales 2015). These changes are thought to underlie why individuals with addiction struggle with making poor decisions. There is an intimate link between memory and decision making. It can be argued that the only reason we learn and remember things is to make better decisions (Redish and Mizumori 2015). Information stored as memories within and between neural structures guides decision processes (Euston et al. 2012). Therefore, if addiction is considered to be a neurobiological disorder of learning and memory, it should also be considered a neurobiological disorder of decision-making information processing. It is thought that humans have evolved in such a way that the brain is capable of storing information in multiple, separate memory systems, each of which affords unique evolutionary advantages (Sherry and Schacter 1987). Theoretically, the existence of multiple memory systems can only afford evolutionary advantages when each system is specialized in such a way that the functional problems and environmental demands overcome by one system cannot be handled by the properties of another system, which could have been shaped by natural selection and adapted to serve other purposes (Rozin and Schull 1988). A memory system is defined by interactions between separable mechanisms of information acquisition, retention, and retrieval that operate under certain rules, which may be fundamentally distinct from those of a separate memory system (Sherry and Schacter 1987).
Taken together, multiple separate mechanisms of memory acquisition, storage, and retrieval are thought to take place in neurally dissociable systems. These principles of neurally distinct memory systems are not limited to stages of memory formation (i.e., acquisition, storage, retrieval) but also extend to different types of information that can be acquired, stored, and retrieved. Multiple memory systems vary in terms of other properties, including the rate of learning or level of generalizability versus specificity of stored information (O'Keefe and Nadel 1978; Squire et al. 1993; Schacter and Tulving 1994). For instance, gradual, incremental learning involved in the acquisition of specific skills is thought to occur in a separate memory system distinct from rapid one-trial learning tied to relationships among specific episodes (Morris et al. 1982; Yin et al. 2004; Tse et al. 2007) or events with salient affective properties (Berridge and Robinson 1998; Dayan and Balleine 2002; Corbit and Balleine 2005). In the former example, practicing and updating repetitive motor programs over numerous trials are thought to depend on a form of reinforcement learning critically dependent on structures within the basal ganglia, including the caudate and putamen regions of the dorsal striatum (Packard and Knowlton 2002; Balleine et al. 2007; Graybiel and Grafton 2015). This memory system, typically referred to as procedural memory, is often spared in individuals with temporal lobe lesions that precipitate impairments in either episodic memories thought to be part of a distinct, hippocampal-dependent learning system (O'Keefe and Nadel 1978; Cohen and Squire 1980; Squire et al. 1993; Cohen and Eichenbaum 1993; Redish 1999, 2013) or emotional memories associated with specific stimuli thought to be part of an amygdala-dependent learning system (LeDoux 1998; Corbit and Balleine 2005; LeDoux and Daw 2018).
Double- and triple-dissociations between separable brain structures and multiple representational forms of memory have been demonstrated in rodents using cleverly designed behavioral paradigms where the rules or contingencies of the task require the use of different types of information stored in separable brain regions. For instance, by using rats trained on variants of a standard radial arm maze memory task that differed only in the contingencies required to successfully obtain rewards, brain region-specific lesions were capable of disrupting performance on select variants of the task but not others (McDonald and White 1993). Dorsal striatum lesions produced deficits in win-stay contingencies, sparing performance on win-shift or cued contingencies, which were sensitive to hippocampus and amygdala lesions, respectively (McDonald and White 1993). Similarly, in rats trained on a standard T-maze memory task, hippocampal versus dorsal striatum lesions could differentially affect performance depending on the degree to which animals were trained. Prolonged training under regular contingencies rendered behavior no longer sensitive to hippocampal lesions but instead sensitive to dorsal striatum lesions (Tolman 1948; Hull 1952; Packard and McGaugh 1992; Gardner et al. 2013; Schmidt et al. 2013). Taken together, the acquisition and expression of certain types of memories appear to take place in neurally separable learning and memory systems that differ depending on a number of properties of that stored information (O'Keefe and Nadel 1978; Squire et al. 1993; Schacter and Tulving 1994; Redish 1999, 2013; van der Meer et al. 2010, 2012).

Multiple decision-making systems

Just as separable memory systems are capable of storing different types of information, neurally dissociable decision-making systems exist to access those separate aspects of stored information. How data is stored can change how it is processed during decision-making. There is a tight relationship between the multiple representational forms that underlie memory and the multiple action-selection systems that are in play when accessing that stored information. Properties that govern differences in the cellular mechanism of storage, rate of learning acquisition, degree of information distribution across cells, and the different circuit networks within which these processes take place can confer differences in how that stored information is accessed. Multiple decision-making systems can operate in parallel with one another and provide trade-offs between decision properties, such as speed of processing, depth of planning, degree of flexibility, and a diversity of other factors that can influence choice (van der Meer et al. 2012). Multiple decision-making systems, which can be updated through unique forms of learning and are thought to reside in separable neural circuits, are thought to have evolved over time because each can be better suited for different situations (O'Keefe and Nadel 1978; Doya 1999; Hikosaka et al. 1999; Daw et al. 2005; Rangel et al. 2008; Redish 2013). Recent theories in neuroeconomics suggest that complex decisions are multifaceted and reward valuations can arise from dissociable computations in distinct neural circuits (Loewenstein et al. 2008; Rangel et al. 2008; van der Meer et al. 2012). For instance, decisions driven by emotion, decisions planned out after extended deliberation, and decisions made from practiced habit, each arise from dissociable neural processes dependent on different neural circuits. 
This concept is similar to the fact that multiple memory systems uniquely related to each of these three previous examples can exist in the brain, but importantly differs in that such separable decision processes can gain access to these different types of memories simultaneously and in parallel with one another during on-going behaviors. Thus, carefully designed behavioral tasks are required to elucidate how multiple, parallel decision-making systems work together or compete with one another in order to access separable memories and drive behavior in the moment. Pavlovian action-selection systems entail genetically hardwired motivational state-response action pairs that are capable of being associated with predictive stimuli through conditioning (Clark et al. 2012; Dayan and Berridge 2014). Physiological states are capable of driving motivation (e.g., hunger) and are linked to unconditioned responses (e.g., salivating). Importantly, these processes can be directly transferred to informative cues in the world. For example, images that are associated with a certain reward, rather than simply predicting upcoming reward availability or opportunities, can themselves adopt intrinsic value. This concept, termed incentive salience, can trigger feelings of wanting or craving in response to cue presentation and promote reward-seeking behaviors (Robinson and Berridge 1993; Bernheim and Rangel 2004; Berridge and Robinson 2016). The role of amygdala-related circuitry has been heavily implicated in these mechanisms (Clark et al. 2012; Wassum and Izquierdo 2015). Such circuits carry learned representations of sensory stimuli and integrate that information with motivational processes (LeDoux and Daw 2018). Through failure modes in these mechanisms, addiction-related cues are capable of triggering decision-vulnerable states that lead to maladaptive motivated behaviors, ultimately precipitating relapse (Bernheim and Rangel 2004; Robinson and Flagel 2009; Walters and Redish 2018). 
Deliberative action-selection systems entail declarative, episodic evaluation processes rooted in simulations of possible future response-outcome scenarios (Redish 2016). Deliberative valuation algorithms operate relatively slowly yet remain flexible. Hippocampus and regions of the prefrontal cortex have been heavily implicated in these mechanisms (Johnson et al. 2007; van der Meer et al. 2012; Wang et al. 2015). Failure to engage deliberative algorithms when making decisions without planning (Everitt and Robbins 2005), a reduction in capacity to accurately simulate possible future scenarios (Kurth-Nelson et al. 2012), errors in the future scenarios themselves (Goldman et al. 1987; Redish and Johnson 2007; Kurth-Nelson and Redish 2012), and errors in the value estimates of those future scenarios (Tiffany 1995; Naqvi and Bechara 2010) each describe fundamentally distinct and dissociable vulnerabilities in decision-making information processing within the deliberative system. Procedural action-selection systems entail well-practiced behavioral sequences that are released ballistically following the recognition of appropriate situations (Graybiel 1998; Redish 2013; Graybiel and Grafton 2015). These decision processes operate quickly yet are relatively inflexible and rely on motivational components accessed via cached value representations (Daw et al. 2005). The dorsal striatum has been heavily implicated in these mechanisms (Saint-Cyr 1988; Berke et al. 2009; Gremel and Costa 2013; Smith and Graybiel 2013). Such circuits are recruited over many trials, and information stored in these circuits is thought to be acquired through a form of reinforcement-like learning. Possible failure modes in the procedural system include increased valuation due to drug-modifications of dopaminergic signals (Redish 2004; Dezfouli et al. 2009), inabilities to extinguish perseverative motor programs (Peters et al.
2009), and strong habit-like processes that override other valuation algorithms leading to enhanced rates of information stored as procedural memories requiring less-than-usual number of training trials (Piray et al. 2010). These multiple action-selection systems, with their separate vulnerabilities, also interact, for example in the process of Pavlovian-Instrumental Transfer, in which amygdala-driven Pavlovian valuations can change the valuation stage of deliberative decisions occurring in accumbens (Talmi et al. 2008; Corbit and Balleine 2011; LeDoux and Daw 2018). If addiction is to be considered a neurobiological disorder of memory, and thus decision-making, the heterogeneity with which information is both stored and processed must be taken into account. Thus, addiction-related dysfunctions in memory could be diverse and could lead to lasting heterogeneous circuit-specific changes in dissociable decision-making computational processes. These multiple vulnerabilities could generate subtly different behavioral phenotypes; however, it is also possible for distinct failure modes in separate systems to produce identical behavioral dysfunctions. Fortunately, current theories of decision-making are capable of making explicit predictions that can reveal critical differences in the neural computations that underlie those behaviors (Redish et al. 2008; Walters and Redish 2018).
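
The contrast drawn above between cached procedural values and deliberative simulation of outcomes can be illustrated with a toy devaluation example. What follows is only a minimal sketch under stated assumptions: a single lever-press action, a delta-rule update for the cached value, and a one-step outcome lookup for the deliberative value. The function names, learning rate, and reward numbers are hypothetical and are not taken from any of the cited models.

```python
# Toy contrast between a cached (procedural) value and a deliberative value.
# All names and numbers are hypothetical and chosen for illustration only.

def learn_cached_value(rewards, alpha=0.1):
    """Incrementally learn a cached action value with a simple delta rule."""
    q = 0.0
    for r in rewards:
        q += alpha * (r - q)  # nudge the stored estimate toward each experienced reward
    return q


def deliberative_value(outcome_probs, outcome_utility):
    """Value an action by enumerating its expected outcomes and valuing them now."""
    return sum(p * outcome_utility[o] for o, p in outcome_probs.items())


# Training: pressing the lever always yields a pellet worth 1.0.
cached_q = learn_cached_value([1.0] * 200)
outcome_probs = {"pellet": 1.0}
utility = {"pellet": 1.0}
value_before = deliberative_value(outcome_probs, utility)

# Devaluation: the pellet loses its worth, with no further lever presses experienced.
utility["pellet"] = 0.0
value_after = deliberative_value(outcome_probs, utility)

print(f"cached value (unchanged by devaluation): {cached_q:.3f}")
print(f"deliberative value before devaluation:   {value_before:.3f}")
print(f"deliberative value after devaluation:    {value_after:.3f}")
```

In this sketch the cached value stays high after devaluation because it can only change through new experience of the action, whereas the deliberative value drops immediately because it is recomputed from the current worth of the simulated outcome. The same overt behavior before devaluation can therefore rest on two different computations.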

Beyond simple tests of value

Over the last four decades, behavioral neuroscience has developed a variety of tasks that can separate these decision systems by putting them into conflict with each other. Examples include classic studies of devaluation that find differences between training contingencies (Balleine and Dickinson 1998; Coutureau and Killcross 2003), studies of water maze behavior that depend on training (Morris et al. 1982; Eichenbaum et al. 1990; Day et al. 1999; Redish 1999), and the classic T-maze and other contingency-dependent tasks (Barnes et al. 1980; Packard and McGaugh 1992; Gardner et al. 2013; Schmidt et al. 2013). In more recent work, neuroeconomics has refined these tasks and developed additional tasks that can measure the valuation of these components directly (Coricelli et al. 2005; McCoy and Platt 2005; Hayden et al. 2008, 2009; Abe and Lee 2011; Kalenscher and van Wingerden 2011; Steiner and Redish 2014; Sweis et al. 2018a,b). In contrast, many of the tasks used in rodent models of compulsive drug-seeking behaviors have relied on paradigms that, by design, are better suited to probe the changes in information stored (i.e., mechanisms of memory), rather than the changes in behavioral processing (i.e., decision-making) (Fig. 1). Such tests include drug conditioned place preference or drug self-administration paradigms (Stafford et al. 1998; Tzschentke 2007). In these paradigms, the time voluntarily spent in a drug-paired chamber or the number of drug infusions self-administered on a progressive ratio schedule serves as the primary behavioral metric of reward valuation. However, additional valuation information can be extracted from these tasks by examining rate of learning acquisition, by examining individual differences in high-responders versus low-responders, or by measuring how such behaviors can be extinguished and reinstated following a number of various triggers (Piazza et al. 1990; Carroll and Lac 1993; Gosnell 2000; Lu et al. 2003; Shaham et al.
2003; Perry et al. 2005). These latter examples have been used to model “relapse” in nonhuman animals and have provided a foundational understanding about the neurobiological mechanisms of reward-related learning and memory associated with addiction.
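
To make the progressive ratio metric mentioned above concrete, the sketch below counts how many rewards an animal earns and the last response requirement it completes (the break point) given a total number of lever presses in a session. It is a minimal illustration only: the function name, the escalating schedule, and the press count are hypothetical, and real protocols typically also impose a time-out criterion that is omitted here.

```python
def pr_break_point(ratio_schedule, total_presses):
    """Return (rewards_earned, last_completed_ratio) for one session.

    ratio_schedule: response requirements for successive rewards (hypothetical values).
    total_presses:  lever presses the animal emitted before it stopped responding.
    """
    rewards_earned = 0
    last_completed = 0
    remaining = total_presses
    for requirement in ratio_schedule:
        if remaining < requirement:
            break  # the animal quit before finishing this ratio
        remaining -= requirement
        rewards_earned += 1
        last_completed = requirement
    return rewards_earned, last_completed


# Hypothetical escalating schedule and session total.
schedule = [1, 2, 4, 6, 9, 12, 15, 20, 25, 32]
rewards, break_point = pr_break_point(schedule, total_presses=40)
print(rewards, break_point)  # 6 rewards earned, break point of 12
```

Note that this metric collapses an entire session into one number, which is part of why it says little about the within-trial computations that produced each press.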

Figure 1.

Task design matters when probing memory versus decision-making processes. (A) Memory and decision-making are thought to exist as duals of each other. How information is stored changes how it is processed. Different decision-making mechanisms access stored information traded off in different ways, and thus, select actions by fundamentally distinct computational algorithms. (B) Tasks that interrogate processes on varying time scales are better suited to probe memory versus decision-making computations. (C) Tasks designed to measure behaviors on longer time scales (days) versus shorter time scales (within trial) are better suited to probe memory mechanisms (information storage, consolidation, updating) versus decision-making mechanisms (information processing and action-selection valuations), respectively. (D) Two task examples better suited to probe either memory processes (traditional paradigm: self-administration task) or decision-making processes (neuroeconomic paradigm: restaurant row task), both of which are capable of investigating aspects of reward-related self-control. (Top) In traditional operant chamber paradigms, principal or initial reward valuations (for food or drug) are measured during an acquisition learning period (usually across trials or days, estimated via break point on a progressive ratio lever-press sequence). Extinction periods can probe rates with which new valuation processes are learned that suppress principal valuations (across days). Active maintenance of extinction learning, or susceptibility to lose suppression following reinstatement, implies principal valuation memories coexist with extinction memories, yet such competing computations are not accessible in traditional operant paradigms. (Bottom) In neuroeconomic paradigms, reward value can be calculated a number of ways within a single trial.
In this version of the restaurant row task, hungry mice are trained to forage for food rewards of varying costs (delay, cued tone pitch) and subjective value (flavor, spatial contexts or restaurants) while on a limited time budget. Decisions are deconstructed into discrete stages in separate offer zones and wait zones on each trial in each restaurant. Each action-selection process reflects a valuation computation, and each computation reflects a different economic algorithm (choosing between entering versus skipping in the offer zone, deciding to opt out and quit versus remain patient until earning in the wait zone, taking time to consume a pellet and linger in a conditioned place versus leaving and advancing to the next trial). In each of these action-selection processes, decision conflict and self-control can be separately captured between highly desired although expensive reward opportunities.

When addiction models have been tested with tasks derived from behavioral neuroscience, conflicting results have often been found. For example, rats that were willing to expend more lever presses in a progressive ratio schedule for cocaine or heroin than for saccharin still chose the saccharin when given an option between the two (Ahmed 2005, 2010; Ahmed et al. 2013; Perry et al. 2013; Vandaele et al. 2016), consistent with experiments in monkeys that found differences between separated, bundled, and contrasted options (Nader and Woolverton 1991, 1992; Nader et al. 1993; Czoty et al. 2005; John et al. 2015). While most rats would take cocaine if it was the only option available, only a subset showed a willingness to cross a shock for it (Shalev et al. 2000; Deroche-Gamonet et al. 2004; Belin et al. 2009; Deroche-Gamonet and Piazza 2014; Martin-Garcia et al. 2014). While many studies have shown that drugs (cocaine, alcohol, heroin) do not respond to devaluation (Dickinson et al. 2002; Everitt and Robbins 2005; Zapata et al.
2011; Everitt 2014), suggesting they are due to procedural and habit processes, other studies have shown that subjects can plan complex novel sequences to drug rewards (Olmstead et al. 2001) and that rats will use knowledge of historical valuations to plan options (Marks et al. 2010). Recent theories in neuroeconomics suggest that decisions made in different situations derive from different valuation functions residing in separable neural circuits. It can be difficult to segregate these parallel information processing algorithms using traditional experimental addiction paradigms that rely on simple tests of value and compulsive drug-seeking behavior. Even within the same trial, decision algorithms can change and thus the computations ultimately driving behavior can be multifaceted (Sweis et al. 2018a,b). For instance, the value assigned to choosing one option over another can be calculated through distinct discounting functions depending on what information is being incorporated in a given decision (e.g., intertemporal deliberative choices deciding between “this option or that option” versus foraging choices deciding “to give up on and abandon the current endeavor,” Carter and Redish 2016; Sweis et al. 2018a,b). In this light, distinct neural dysfunctions in either of these processes could ultimately lead to maladaptive behavioral consequences that might be indistinguishable in simple experimental paradigms. Reinforcement learning protocols, in the form of drug conditioned place preference and self-administration, provide useful information about mechanisms of drug memory consolidation and retrieval by measuring acquisition, extinction, and reinstatement parameters. These processes are revealed across sessions on the timescale of days to weeks. Furthermore, these tasks can characterize how existing memories change slowly over time when contingencies are updated. This is clearly demonstrated when reward-seeking behaviors are extinguished over time (across days). 
This is also seen in reinstatement sessions when learning is updated and drug-seeking behaviors reemerge in models of relapse. For example, numerous studies of extinction-reinstatement processes have determined how extinction learning processes acquire new valuations (in this example, “not to seek drug rewards”) that override existing, originally learned drug-seeking valuations. Importantly, it is widely accepted that this form of learning is acquired in addition to existing valuations rather than through a process in which old valuation learning is removed or forgotten (Bouton 2004). These separable learning and memory systems indeed occur in distinct neural circuits (Berman and Dudai 2001; Suzuki et al. 2004; Peters et al. 2009). Thus, multiple memory systems that exist in parallel can separately contribute to behavior in these tasks on a moment-by-moment basis. However, because these forms of learning develop over long timescales and because the primary behavioral output on these simple tasks is binary (e.g., press a lever or not), these tasks are unable to capture the different decision-making computations that might lead up to and produce behaviors, particularly the conflict between systems. For instance, following the extinction of conditioned place preference, it is accepted that originally learned valuations are not simply removed or forgotten but rather new overriding valuations are secondarily learned and that both forms of memory remain intact (Rescorla 2001; Bouton 2004). However, it remains unclear how both types of stored information compete with one another since they are thought to coexist. That is, how is competition between multiple decision-making systems resolved before an action is ultimately selected or not? Regardless of whether extinction is maintained or reinstatement is precipitated, how are these separately stored reward-related memories accessed and integrated to produce a single behavioral output?
The decision to leave the reward-paired chamber versus not to leave could reflect fundamentally distinct neural computations from a decision to remain in the unpaired chamber versus actively seeking out the reward-paired chamber (German and Fields 2007a,b). These separate computations could be differentially disrupted in distinct forms of addiction. A similar argument could be made among various subtle action-selection processes in drug self-administration tasks (e.g., differences in trained lever presses versus naturalistic nose pokes) (Gerhardt and Liebman 1981). Cross-task comparisons imply that hypotheses of singular, objective definitions of reward value are problematic and lead to economic paradoxes. For instance, reward value measured by calculating the breakpoint of lever pressing in a progressive ratio self-administration paradigm for drug in one session versus saccharin in a separate session is inconsistent with reward value measured in the same animals by calculating the probability of choosing drug over saccharin in a two-alternative forced-choice paradigm (Deroche-Gamonet et al. 2004; Kasanetz et al. 2010; Perry et al. 2013; Vandaele et al. 2016). Choosing between options is thought to access fundamentally distinct processes from choosing to remain committed to versus abandoning current endeavors (Carter and Redish 2016; Redish et al. 2016). The former is thought to recruit deliberative processes (Redish 2016) while the latter is likely driven by Pavlovian associations embedded in foraging processes (Dayan et al. 2006). Thus, reward value is not singular within the brain, but rather depends on the separate neural algorithms used to compute value in distinct decision systems. By moving beyond simple tests of value, complex tasks may be better able to operationalize reward value in a number of various ways, sometimes within the same task, and even within the same trial (Fig. 1). 
This allows neural computations to be more readily dissociable during on-going decisions. Tasks rooted in theories of fundamentally distinct valuation algorithms are capable of dissociating neural computations underlying specific aspects of decision-making information processing. They can do so by separating multiple dimensions of reward value (e.g., effort, price, opportunity cost, reward magnitude, budget constraints [Wikenheiser et al. 2013; van Wingerden et al. 2015; Salamone et al. 2018; Sweis et al. 2018a,b]) from other behavioral processes (e.g., locomotor capabilities, spatial and semantic knowledge of task rules and contingencies), by accessing revealed subjective valuations rather than assuming reward magnitude objectively (e.g., using reward quality and individual preferences, not reward quantity [Levy and Glimcher 2011; Steiner and Redish 2014; Sweis et al. 2018a,b]), by introducing changes in external demands (e.g., leaner versus richer environments [Wikenheiser et al. 2013; Sweis et al. 2018a]), and by deconstructing stages of decision-making discretely within trial (e.g., stepwise information presentation and action-selection processes leading up to reward earning separated across space and time [Sweis et al. 2018a,b]). These novel task designs can access economic principles in human decision-making (both those well-studied and those still perplexing; e.g., theories of demand elasticity and compensation, counterfactual processing, regret, sunk costs, post-purchase rationalization, and other cognitive heuristics [Camille et al. 2004; Coricelli et al. 2005; Hayden et al. 2009; Abe and Lee 2011; Steiner and Redish 2014; Sweis et al. 2018a,b]). Importantly, neurally dissociable decision systems, including Pavlovian, deliberative, and procedural processes, can be more closely tracked through behavior separated across space and time within trial, and the motivating forces that drive those decision processes to update across trials can be revealed.
In particular, much can be learned during circumstances of decision conflict. This can be operationalized in a number of ways. Especially if adopted within a neuroeconomic framework, multiple types of conflict scenarios that have not been easily measurable in the past using nonhuman animals can be constructed along a continuum of competing costs and rewards, including conflicts between reward and threat, between reward and self-inflicted pain, or between wanting highly desired although expensive rewards and knowing better to forgo such opportunities for economically smarter alternatives (Friedman et al. 2015; Guo et al. 2015; Resendez et al. 2016; Kim et al. 2017a; Sweis et al. 2018a,b). Such neuroeconomic paradigms can pit multiple decision-making systems against one another and can reveal hidden information about the computational processes underlying specific aspects of behavior and within what neural constraints they might operate. Furthermore, such paradigms are naturalistic, do not rely on introspection to reveal dissociable cognitive processes, and are easily translatable across species (Sweis et al. 2018b). Through this approach, computational processes of specific neural mechanisms can be revealed to be conserved across evolution and can be more thoroughly studied using the variety of tools, disease models, and patient populations available across species.

On circuit heterogeneity

To probe the multiple memory systems and multiple decision-making systems in the brain, circuit-specific and temporally precise tools that can gain access to endogenous mechanisms of both information storage and information processing are required. Recent advances in circuit-dissection tools have made studying specific populations of cells possible (Yizhar and Adamantidis 2018). Conditional genetics have allowed experimenters to define interrogation parameters based on cell-expression profiles, activity-dependence, and projection specificity, leveraging voltage- and calcium-sensing reporters with novel imaging or opto-tagging techniques and utilizing chemogenetic or optogenetic manipulations (Guo et al. 2015; Resendez et al. 2016; Kim et al. 2017b). As a result, newer studies have been able to increase the functional resolution of diverse circuit-specific specialization. Recent delineation of circuit heterogeneity of the mesocorticolimbic dopaminergic system illustrates this point. Novel circuit-dissection tools have revealed the wide diversity of unique inputs into the VTA on dopaminergic and nondopaminergic neurons as well as the wide diversity of unique VTA output structures that include the cell-type identity of their targets (e.g., medial VTA dopaminergic neurons that receive glutamatergic input from the lateral habenula and project to D1-receptor-containing GABAergic medium spiny neurons in the shell of the nucleus accumbens) (Morales and Margolis 2017). Pathways thought to be critically affected by addiction are being subdivided at a rate faster than the functional roles of those different subcircuits are being realized (Britt et al. 2012; Tye 2012; Morales and Margolis 2017). With these tools available, we suggest that, as the field moves forward, a critical step toward understanding the neurobiology of addiction is to take into consideration the multiple memory and multiple decision-making systems.
Importantly, this means not only moving beyond simple tests of value, which are commonplace in many addiction studies, but also designing the circuit-manipulation approach so as to carefully probe the link between memory and decision-making processes. To accomplish this, we argue that direct “off-line” manipulations of circuit-specific plasticity can help realize the functional consequences of synaptic remodeling on endogenous circuit-computation-specific information processing. Because lasting heterogeneous aspects of memory can be altered due to a variety of addiction-related pathologies, such changes can give rise to unique susceptibilities to addiction via vulnerabilities in circuit-specific computational processes that might otherwise be masked either by using simple tests of value or by using circuit interrogation techniques that disrupt endogenous information processing. We discuss these concepts further below.

Off-line induction of circuit-specific plasticity

Most of the interrogation approaches used in recent optogenetic circuit-specific dissection studies in nonhuman animals rely on direct activity manipulations during on-going behaviors (Fig. 2A,B; Morales and Margolis 2017). That is, these manipulations are delivered “on-line” in real time, affecting the neural computations that drive behaviors. In fact, the vast majority of optogenetic behavioral studies fall under Figure 2A (Morales and Margolis 2017), while only a handful fall under Figure 2B (Schelp et al. 2017). In either case, this approach has two critical limitations in relation to studies of addiction and decision making. The first, previously discussed, is that the majority of behavioral tasks used in circuit-dissection studies comprise simple tests of value. The second is that on-line manipulations impose disruptions of endogenous neural signaling and provide little insight into the functional consequences of synaptic remodeling on information encoding.

Figure 2.

Neuromodulation intervention strategy in combination with task design matters. (A,B) Online neuromodulation manipulations (e.g., circuit-specific optogenetic stimulation) describe those where stimulation (either activation of excitatory opsins like channelrhodopsin-2 [ChR2] or inhibitory opsins like halorhodopsin [HaloR]) is delivered during on-going behaviors of interest. This could be time-locked to cue or lever-presentation in traditional paradigms (A) where extinction maintenance or reinstatement susceptibility can be assessed. This could also be time-locked to distinct decision-making action-selection processes in neuroeconomic paradigms (B) during reevaluative change of mind decisions, for instance, only in high-conflict economic scenarios. However, in either (A) or (B), endogenous neural activity is disrupted. While this can reveal important information regarding on-going neural dynamics necessary or sufficient for certain behaviors, on-line neuromodulation actually reveals little regarding the functional consequences of synaptic plasticity in relation to addiction-related changes in neural circuitry. (C,D) Off-line neuromodulation interventions are capable of directly manipulating circuit-specific plasticity. For instance, well-characterized plasticity-inducing stimulation protocols (induction of long-term-depression in glutamatergic cortical pyramidal projections into the nucleus accumbens following 10 min of 10 Hz stimulation via ChR2) can be delivered acutely outside of behavioral testing. By observing lasting changes in behavior at later time points, the functional consequences of synaptic remodeling can be realized (e.g., mimicking disease states or reversing them). (C) By applying this approach in traditional paradigms, the functional consequences of circuit-specific synaptic remodeling on memory-related processes can be realized. 
(D) By applying this approach in neuroeconomic paradigms, the functional consequences of circuit-specific synaptic remodeling on separable decision-making computational processes can be realized.

An alternative approach is to alter directly the synaptic efficacy of signal transmission in specific circuits through alterations in synaptic plasticity (Fig. 2C,D). The goal of this approach is to change the weight of the information endogenously transmitted through a specific circuit, but not disrupt the information that is coded in this specific circuit during behavior. Importantly, these types of manipulations thus are delivered “off-line” outside of behavioral testing. To date, only a handful of studies fall under Figure 2C, which we will discuss below, while only a single recent study has adopted the approach in Figure 2D, which we will return to at the end of this review. Only recently have tools been developed to directly manipulate the strength of circuit-specific synapses. Plasticity-altering interventions can now be delivered to specific synapses via optogenetics by expressing opsins in a circuit of interest and delivering a brief stimulation protocol in order to elicit lasting changes in synaptic efficacy. Using this approach, for example, excitatory opsins expressed in input-specific glutamatergic pathways can be activated in a temporally precise manner to intentionally elicit long-term potentiation or depression. These tools have been used to good effect in several recent studies of the neurobiology of addiction (Pascoli et al. 2011, 2014; Ma et al. 2014; Creed et al. 2015; Hearing et al. 2016; Benneyworth et al. 2018).
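The example protocol given in the Figure 2 legend (10 min of 10 Hz stimulation) can be written out as a pulse schedule to make its scale explicit. This is only a sketch under stated assumptions: the function name `pulse_onsets` and the 4 ms pulse width are hypothetical and are not taken from the cited studies.

```python
def pulse_onsets(freq_hz: float, duration_s: float) -> list[float]:
    """Onset times (in seconds) of evenly spaced pulses at freq_hz for duration_s."""
    n_pulses = round(freq_hz * duration_s)
    return [i / freq_hz for i in range(n_pulses)]


FREQ_HZ = 10.0          # from the text: 10 Hz
DURATION_S = 10 * 60    # from the text: 10 min
PULSE_WIDTH_S = 0.004   # assumption for illustration only

onsets = pulse_onsets(FREQ_HZ, DURATION_S)
light_on_s = len(onsets) * PULSE_WIDTH_S

print(len(onsets))            # 6000 pulses
print(onsets[1] - onsets[0])  # 0.1 s between pulse onsets
print(round(light_on_s, 3))   # 24.0 s of light in total under the assumed width
```

The point of the arithmetic is that the whole intervention is a single, brief session delivered outside of behavioral testing, so any behavioral change measured later reflects a lasting change in the circuit and not a disruption of on-going activity.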
However, because such studies were performed using standard animal models of addiction (psychomotor sensitization, drug self-administration, or conditioned place preference), the functional consequences of synaptic remodeling on memory processes (e.g., consolidation, maintenance, extinction, or retrieval) and not decision-making processes were probed. Studies of behavioral extinction in self-administration paradigms support a model of how plasticity in specific corticostriatal circuits is necessary for learned self-control (Peters et al. 2008; LaLumiere et al. 2010; Barker et al. 2012; Bossert et al. 2012; Gass and Chandler 2013; Keistler et al. 2015; Augur et al. 2016). For instance, long-term depression induced by optogenetically manipulating glutamatergic-specific excitatory pyramidal neurons that project from the infralimbic subregion of the prefrontal cortex to the shell of the nucleus accumbens is capable of triggering reinstatement behaviors (Benneyworth et al. 2018). This idea is consistent with the “hypofrontality” model of addiction that characterizes the inability of individuals with weaker corticostriatal connections to regulate maladaptive motivated behaviors (Kalivas and Volkow 2005; Bickel et al. 2007; Chen et al. 2013; Camchong et al. 2014). However, neuromodulation studies applying the same plasticity-inducing procedure in cocaine-abstinent mice versus morphine-abstinent mice produce opposing findings using simple drug conditioned place preference tests (Hearing et al. 2016; Benneyworth et al. 2018). Similar optogenetically induced plasticity interventions (here, long-term depression) prevent drug-prime induced reinstatement of drug-seeking behavior in morphine-abstinent mice (Hearing et al. 2016), but provoke spontaneous reinstatement of drug-seeking behavior in cocaine-abstinent mice (Benneyworth et al. 2018).
This distinction is critically important if we want to design treatments, as it suggests the same treatment can be dysfunction-preventing in some situations, but dysfunction-provoking in others. This is especially concerning when addictions to different substances of abuse are often lumped together, both neuroscientifically and clinically. It is possible that cocaine-abstinent and morphine-abstinent mice may have undergone changes in fundamentally distinct computational processes despite appearing grossly similar by the end of extinction training (Badiani et al. 2011). Thus, altering plasticity at a single connection can lead to drastically different behavioral outputs. By not carefully measuring the decision-making computational processes that may have separately gone awry in cocaine-abstinent mice versus morphine-abstinent mice, knowing how to treat potentially distinct decision-making vulnerabilities becomes guesswork at best. Put simply, the functional consequences of synaptic remodeling on the discrete neural computations that drive on-going behavior remain ambiguous when tested in simple behavioral paradigms optimized for probing mechanisms of memory and not necessarily decision making. Direct interrogation of the synaptic efficacy of specific circuits is an important tool for determining how information is processed as decisions are made. However, manipulating plasticity to understand the functional consequences of circuit-specific synaptic remodeling on distinct aspects of decision-making information processing is only as useful as the task utilized is sensitive to circuit-computation-specific behaviors. Combining circuit-specific off-line neuromodulation with complex testing that reveals separable behavioral computations is crucial.
This combined approach will be critical for the development of disease-mitigating neuromodulation therapies, particularly those in which the benefit is intended to outlast the duration of neural stimulation and especially not unintentionally worsen disease prognosis. In order to begin to appreciate the complexity with which addiction-related processes can give rise to heterogeneous circuit dysfunctions, next we will briefly discuss the varying plausible levels of addiction pathogenesis in this context before returning to how a combined approach of decision-making neuroeconomics and off-line plasticity manipulations can aid in resolving disease heterogeneity.

On addiction heterogeneity

Drug-related experiences are capable of leaving lasting influences on the brain and behavior (Robinson and Berridge 2003; Le Moal and Koob 2007). Thus, problematic drug abuse and susceptibility to relapse is often attributed to neuronal pathologies in mechanisms of learning, memory, and plasticity (Nestler 2001; Hyman and Malenka 2001; Hyman 2005; Kauer and Malenka 2007; Thomas et al. 2008; Lüscher and Malenka 2011). This view has dominated much of addiction neurobiology research. However, an ultimate understanding of addiction pathogenesis will depend on appreciating the multiple plausible causes of drug-use (Fig. 3).

Figure 3.

On addiction heterogeneity: Classes of plausible dysfunctions. (A) All drug use can be subdivided into casual drug use (majority) versus problematic drug use. (B) Individuals with problematic drug use can be divided into those with pathologies (originating cause) due to external, social factors versus pathologies rooted within the individual either via a predisposing vulnerability or a direct change induced by an ingested substance. (C) Internal pathologies can be divided into those with primary changes in neurons versus nonneuronal players (e.g., glia). (D) Neuronal changes, or plasticity, can be divided into changes that come about from normal mechanisms of learning and memory or a dysfunctional breakdown of such processes that normally do not occur. (E) Normal mechanisms of learning and memory can be driven in reward-related circuits by dopamine-mediated processes or non-dopamine-mediated processes (e.g., endocannabinoid signaling). Blue arrows in between nodes indicate interaction pathologies that could have either unidirectional or bidirectional influences on each other. (F) Any resultant changes in the brain that arise from internal pathologies, regardless of the underlying primary mechanism, can each induce failure modes in dissociable neural circuits, each of which can give rise to fundamentally distinct addiction etiologies in separable neural computations.

The vast majority of drug users in fact do not go on to display problematic drug use (Fig. 3A; Anthony et al. 1994). Comparisons between casual drug users and problematic drug users can perhaps reveal fundamental differences in neurobiological functions underlying the severity and chronicity of or resiliency from relapse. There is a large body of work in both human and nonhuman animals comparing compulsive drug-seeking behaviors to non-drug-exposed and non-drug-treated controls.
There is a relatively smaller literature comparing individual differences in compulsive drug-seeking behaviors to casual drug consumption. However, several recent studies have suggested that even within standard animal addiction models, only subsets of animal subjects show phenotypes that closely resemble human addictions (Ahmed 2005, 2010; Ahmed et al. 2013; Pickard et al. 2015). For example, Vandaele et al. (2016) found that although all rats tested showed larger progressive ratio breakpoints to cocaine and heroin than to saccharin, most rats would choose saccharin over cocaine or heroin if given an actual choice between them. Perry et al. (2013) found that the subset of rats that would choose cocaine or heroin over saccharin in the two-available-choice condition showed other similar addiction-related phenotypes. Deroche-Gamonet et al. (2004; see also Kasanetz et al. 2010) found that, while all rats would lever-press or nose-poke for cocaine, only a subset would cross a shock to reach the drug, and that same subset showed excessively high progressive ratio breakpoints. Jaffe et al. (2014) found that the subset of nicotine-seeking rats that showed excessively high progressive ratio breakpoints also showed a lack of Kamin blocking response to nicotine, even though their Kamin blocking response to food was normal. Most studies investigating the neuroscience of addiction pathology have focused almost exclusively on those due to internal factors within a subject, meaning within the brain (Fig. 3B). Pathology, by definition, is derived from the Greek words “pathos” meaning “disease” and “logos” meaning “treatise,” describing the science of the causes and origins of disease states. Sometimes the term pathology can be misused and instead of referring to disease origin, may sometimes be ascribed to reflect measurable evidence of disease state, either as dysfunctions downstream from disease origin or even disease symptomology. 
However, in certain cases, the cause and origin of problematic drug use can be entirely attributed to external (i.e., environmental) factors. For example, in a neural system that may be working perfectly normally, with no predisposition or known susceptibility to drugs, it might be the case that associations constructed in certain social settings can drive drug abuse. For instance, television ads or the entertainment industry that publicize, glorify, or sexualize drug use can drive problematic drug use in viewers with access to those media. In such cases, a failure in an individual's brain may not be the underlying origin of his or her addiction, nor might the mechanism of action of the ingested substance on the individual's brain be the origin of maladaptive behavior. In such cases, successful treatments may need to be rooted in environmental and social interventions. Of course, two individuals who undergo similar media exposures might yet emerge with different disease states. Thus, for such questions, problematic drug use is likely rooted in an interaction pathology between the individual neural system and the external environment (Fig. 3B blue arrows).
Many of the neuroscience studies of addiction and the brain have focused almost exclusively on addictions with neurophysiological etiologies (Fig. 3C). However, recent work highlights the involvement of key nonneuronal players including glia and the immune system, both centrally and peripherally located, that could serve roles as sources of dysfunction, potential diagnostic or prognostic disease biomarkers, or possible novel avenues for therapeutic intervention. Astrocyte and microglia functions, including those critical for synapse formation, can be disrupted by alcohol, psychostimulants, and opioids through direct means, through indirect means via neuronal signaling, or through the innate immune system, which together contribute to added drug-abuse liability (Navarrete and Araque 2008, 2010; Fox et al. 2012; Araos et al. 2015; Northcutt et al. 2015; Lewitus et al. 2016; Karlsson et al. 2017; Lacagnina et al. 2017; Ostroumov and Dani 2017; Calipari et al. 2018; Neuhofer and Kalivas 2018). It is certainly possible that primary dysfunctions in nonneuronal players could give rise to secondary neuronal dysfunctions and vice versa with interacting pathologies (Fig. 3C blue arrows), but it is important to consider primary dysfunctions in order to better appreciate addiction etiology. It is possible that both neurophysiological and glial dysfunctions could exist in fundamentally distinct forms of addiction, yet this remains to be explored. One could imagine a scenario in which a primary dysfunction in glia that gives rise to maladaptive neuronal plasticity could continue to give rise to maladaptive circuit changes even if neuron-targeted therapies are administered.
Within the realm of neuronal changes, some addiction-related changes depend on dysfunctional forms of plasticity and some are instantiated through normal learning mechanisms (Fig. 3D). Addiction-related changes in memory are often thought to be functioning within intact learning mechanisms at the molecular and cellular levels (if perhaps at an accelerated rate or enhanced level) (Hyman and Malenka 2001; Hyman 2005; Kauer and Malenka 2007; Thomas et al. 2008; Zweifel et al. 2008; Harnett et al. 2009; Lüscher and Malenka 2011; Kodangattil and Dacher 2013; Wolf 2016; Hearing et al. 2018). The vast majority of nonhuman-animal studies examining these questions start from the hypothesis that drugs of abuse, either directly or indirectly, take advantage of endogenous reward-related systems and usurp normal mechanisms of learning and memory. While this is a prominent view, there are several reports suggesting it is only one potential path to drug addiction.
In fact, it is important to be careful with the term “plasticity.” Changes in strength of synaptic transmission (e.g., potentiation, depression) either through presynaptic mechanisms (e.g., changes in vesicle release probability) or postsynaptic mechanisms (e.g., changes in receptor densities) can store information about the historical past (Couey et al. 2007; Thomas et al. 2008; Goriounova and Mansvelder 2012; Atwood et al. 2014). Changes in plasticity itself, often termed “metaplasticity,” describe a distinct process in which the history of a given structure alters the direction or magnitude of plasticity in response to subsequent stimulation (Abraham and Bear 1996; Lee and Dong 2011; Mameli et al. 2011; Mameli and Lüscher 2011). Thus, metaplasticity describes changes that augment or diminish the overall degree to which a system is capable of changing and thus can interact with normal learning mechanisms (Fig. 3D, blue arrows).
The vast majority of neuroscience addiction research has focused on changes in dopaminergic signaling or synaptic pathways known to be modulated by dopamine (Fig. 3E). However, plasticity involving other neuromodulators, including endocannabinoids, serotonin, and norepinephrine, is known to contribute to learning and memory, and any of these might also be subject to change in addiction (Chevaleyre et al. 2006; Weinshenker and Schroeder 2007; Hernandez and Cheer 2015; Müller and Homberg 2015) and could conceivably interact with each other (Fig. 3E blue arrows).
In light of these multiple plausible causes underlying continued drug use, a key concept to consider is to what extent heterogeneous circuits might be differentially affected (Fig. 3F).
Even within a single branch of plausible neurobiological mechanisms of addiction—changes mediated through dopaminergic pathways—studies examining drug-induced plasticity have not fully appreciated the heterogeneity of circuits within which these mechanisms may be taking place, especially within the context of decision-making information processing. That is, questions of which circuits and what information underlie addiction-related plasticity remain largely unresolved.

Steps toward resolving addiction heterogeneity

Neuroeconomic tasks are capable of capturing scenarios of economic conflict that can operationalize decision-making concepts of reward value, self-control, and impulsivity in a number of different ways. The Restaurant Row task in rodents can measure the conflict between wanting highly desired reward offers and knowing better to seek out smarter alternatives in multiple decision-making valuation systems separated within the same trial (Sweis et al. 2018a). This demonstrates a way for nonhuman animals to communicate “should versus shouldn't” judgements through complex reward-seeking behaviors. This can model in rodents the difficult types of complex decisions that humans recovering from addiction struggle with before relapsing, and we found that 2 wk of prolonged abstinence from cocaine or morphine produced dissociable lasting disruptions in these fundamentally distinct decision-making algorithms (Sweis et al. 2018c). In such a neuroeconomic framework, the concept of self-control becomes much more nuanced, as it can have different implications for the separable neural computations contained within distinct decision-making systems. This brings into question the “hypofrontality” model of addiction that characterizes the inability of individuals with weaker corticostriatal connections to regulate maladaptive motivated behaviors (Kalivas and Volkow 2005; Bickel et al. 2007). How the strength of connectivity between two brain structures alters separable aspects of valuation processing during on-going behaviors is a field of study that remains in its infancy. Pursuing this aim is at the core of linking mechanisms of memory and plasticity to decision-making processes in order to more deeply resolve heterogeneity in addiction pathogenesis.
Supporting this hypothesis of the effect of hypo-frontality on decision-making corrections, and by taking the combined approach described in Figure 2D, we found that long-term depression induced in the infralimbic to accumbens shell pathway produced lasting changes in foraging but not deliberative behaviors measured within the same trial (Sweis et al. 2018d). Furthermore, individual differences in strength of connectivity of this pathway could explain self-control-related decisions in a foraging algorithm independent of and separate from self-control-related decisions in a deliberative algorithm (Sweis et al. 2018d). Interestingly, this circuit manipulation, which has been previously shown to change in both cocaine- and morphine-abstinent mice (Thomas et al. 2000, 2001; Hearing et al. 2018), has also been shown to provoke reinstatement of conditioned place preference in animals exposed to cocaine (Benneyworth et al. 2018) while blocking reinstatement in animals exposed to morphine (Hearing et al. 2016). This approach can reveal the lasting functional consequences of synaptic remodeling on distinct aspects of decision making not accessed using classic experiments and not directly interrogated using on-line neuromodulation approaches. This suggests that circuit-specific manipulations of plasticity are capable of disrupting distinct aspects of decision making relevant to certain subtypes of addiction only revealed when tested on complex behavioral tasks. This also suggests that treatments that may help prolong abstinence and prevent relapse in certain types of addiction validated by testing in simple behavioral paradigms such as conditioned place preference or drug self-administration could potentially worsen disease state in other types of addiction. Of course, many circuits beyond the IL-NACsh pathway are certainly thought to change during the time course of addiction (e.g., drugs on board, during acute withdrawal, immediately following a relapse episode). 
Demonstrations of deliberative algorithms encoded in hippocampal-prelimbic and prelimbic-accumbens core pathways (Johnson et al. 2007; Powell and Redish 2014; Padilla-Coreano et al. 2016; Papale et al. 2016), circuits which have also been shown to change in addiction models (Chen et al. 2013), make interrogating such circuits prime candidates for next steps using this combined neuroeconomics and plasticity approach.
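To illustrate why separating decision stages within a trial matters, the sketch below scores two stages of the same trial independently, so that a change confined to one stage is not averaged into the other. It is a hypothetical data layout and not the published analysis: the `Trial` fields, the two-stage labels, and the toy sessions are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    accepted: bool  # first-stage choice: accept or skip the offer
    quit: bool      # second-stage choice: abandon an accepted offer before reward


def stage_rates(trials):
    """Return (accept rate, quit rate among accepted offers)."""
    entered = [t for t in trials if t.accepted]
    accept_rate = len(entered) / len(trials)
    quit_rate = sum(t.quit for t in entered) / len(entered) if entered else 0.0
    return accept_rate, quit_rate


# Toy sessions (invented): identical first-stage behavior, different second-stage behavior.
before = [Trial(True, False)] * 6 + [Trial(True, True)] * 2 + [Trial(False, False)] * 2
after = [Trial(True, False)] * 4 + [Trial(True, True)] * 4 + [Trial(False, False)] * 2

print(stage_rates(before))  # (0.8, 0.25)
print(stage_rates(after))   # (0.8, 0.5)
```

A single session-level summary, such as the number of rewards earned, would show that something changed between these two toy sessions but not which stage changed; scoring the stages separately attributes the difference to the second stage alone.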

Conclusion

Neural plasticity is purported to underlie long-lasting maladaptations in behavior, making addiction a chronic disorder of life-long struggle against relapse. It is this property of addiction (lasting changes in synaptic function that persist long after drugs of abuse have cleared an individual's system) that makes addiction a disorder of learning and memory and thus of decision-making. Recent advancements in decision science and neuroeconomics reveal that neurally dissociable valuation processes access distinct memories differently, and thus are uniquely susceptible to change as the brain changes during addiction. Therefore, addiction is best considered not as the disease itself but rather as a collection of symptoms of neurobiological diseases that are proving to be more heterogeneous than previously thought. Furthermore, addiction is not merely a disease of memory, but rather a consequence of changes in how decision-making processes access those memories. The advancement of our understanding of addiction etiology, and thus the development of better, lasting disease-mitigating therapies tailored to the individual, will depend on tasks and neuromodulatory approaches that can access the complexity of circuit-computation-specific processes in the brain.

180 in total

Review 1. Addiction.

Authors: Terry E Robinson; Kent C Berridge
Journal: Annu Rev Psychol Date: 2002-06-10 Impact factor: 24.137

2. Plasma profile of pro-inflammatory cytokines and chemokines in cocaine users under outpatient treatment: influence of cocaine symptom severity and psychiatric co-morbidity.

Authors: Pedro Araos; María Pedraz; Antonia Serrano; Miguel Lucena; Vicente Barrios; Nuria García-Marchena; Rafael Campos-Cloute; Juan J Ruiz; Pablo Romero; Juan Suárez; Elena Baixeras; Rafael de la Torre; Jorge Montesinos; Consuelo Guerri; Marta Rodríguez-Arias; José Miñarro; Roser Martínez-Riera; Marta Torrens; Julie A Chowen; Jesús Argente; Barbara J Mason; Francisco J Pavón; Fernando Rodríguez de Fonseca
Journal: Addict Biol Date: 2014-05-22 Impact factor: 4.280

3. Conflict between place and response navigation strategies: effects on vicarious trial and error (VTE) behaviors.

Review 4. Synaptic mechanisms underlying persistent cocaine craving.

5. Reversal of cocaine-evoked synaptic potentiation resets drug-induced adaptive behaviour.

Review 6. Opiate versus psychostimulant addiction: the differences do matter.

Review 7. The Computational Complexity of Valuation and Motivational Forces in Decision-Making Processes.

Review 8. Pavlovian valuation systems in learning and decision making.

9. Memory and decision making.

10. Prolonged abstinence from cocaine or morphine disrupts separable valuations during decision conflict.
