A Neurogenetic Dissociation between Punishment-, Reward-, and Relief-Learning in Drosophila.

Abstract

What is particularly worth remembering about a traumatic experience is what brought it about, and what made it cease. For example, fruit flies avoid an odor which during training had preceded electric shock punishment; on the other hand, if the odor had followed shock during training, it is later on approached as a signal for the relieving end of shock. We provide a neurogenetic analysis of such relief learning. Using UAS-shibire(ts1) to block the output from a particular set of dopaminergic neurons defined by the TH-Gal4 driver partially impaired punishment learning, but left relief learning intact. Thus, with respect to these particular neurons, relief learning differs from punishment learning. Targeting another set of dopaminergic/serotonergic neurons defined by the DDC-Gal4 driver, on the other hand, affected neither punishment nor relief learning. As for the octopaminergic system, the tbh(M18) mutation, compromising octopamine biosynthesis, partially impaired sugar-reward learning, but not relief learning. Thus, with respect to this particular mutation, relief learning and reward learning are dissociated. Finally, blocking output from the set of octopaminergic/tyraminergic neurons defined by the TDC2-Gal4 driver affected neither reward nor relief learning. We conclude that, with respect to the genetic tools used, relief learning is neurogenetically dissociated from both punishment and reward learning. This may be a message relevant also for analyses of relief learning in other experimental systems including man.

Keywords: dopamine; fruit fly; octopamine; olfaction; reinforcement signaling; relief learning

Year: 2010 PMID: 21206762 PMCID: PMC3013555 DOI: 10.3389/fnbeh.2010.00189

Source DB: PubMed Journal: Front Behav Neurosci ISSN: 1662-5153 Impact factor: 3.558

Introduction

Having no idea as to what will happen next is not only bewildering, but can also be dangerous. This is why animals learn about the predictors for upcoming events. For example, a stimulus that had preceded a traumatic event can be learned as a predictor for this event and is later on avoided. Such predictive learning qualitatively depends on the relative timing of events: a stimulus that occurred once a traumatic event had subsided later on supports opposite behavioral tendencies, such as approach, as it signals what may be called relief (Solomon and Corbit, 1974; Wagner, 1981) or safety (Sutton and Barto, 1990; Chang et al., 2003). Such opposing memories about the beginning and end of traumatic experiences are common to distant phyla (e.g., dog: Moskovitch and LoLordo, 1968, rabbit: Plotkin and Oakley, 1975, rat: Maier et al., 1976, snail: Britton and Farley, 1999, adult fruit fly: Tanimoto et al., 2004; Yarali et al., 2008, 2009; Murakami et al., 2010, larval fruit fly: Khurana et al., 2009), including man (Andreatta et al., 2010). This timing-dependency may reflect a universal adaptation to what one may call the “causal texture” of the world, such that whatever precedes X is likely to be the cause of X, and whatever follows X may be responsible for X's disappearance (Dickinson, 2001). Correspondingly, pleasant experiences, too, support opposing kinds of memory for stimuli that respectively precede and follow them (e.g., pigeon: Hearst, 1988; honeybee: Hellstern et al., 1998). Thus, to fully appreciate the behavioral consequences of affective experiences, it is necessary to study the mnemonic effects of their beginning and their end. To do so, the fruit fly offers a fortunate possibility for fine grained behavioral analyses, combined with a small, experimentally accessible brain. 
Once trained with odor-electric shock pairings, fruit flies avoid this odor as a signal for punishment (Tully and Quinn, 1985); training with a reversed timing of events, that is first shock and then the odor, on the other hand, results in approach toward this odor as a predictor for relief (in adults: Tanimoto et al., 2004; Yarali et al., 2008, 2009; Murakami et al., 2010; in larvae: Khurana et al., 2009). Presenting an odor together with a sugar reward establishes conditioned approach, too (Tempel et al., 1983). Punishment and reward learning are well-studied, including how the respective kinds of reinforcement are signaled. Shock activates a set of fruit fly dopaminergic neurons (Riemensperger et al., 2005), defined by the TH-Gal4 driver; blocking the output from these neurons impairs punishment learning, but not reward learning (in adults: Schwaerzel et al., 2003; Aso et al., 2010; in larvae: Honjo and Furukubo-Tokunaga, 2009; Selcho et al., 2009; regarding the former larval study, Gerber and Stocker (2007) filed caveats which may challenge the associative nature of the used paradigm). Also, loss of function of the dopamine receptor DAMB selectively impairs punishment rather than reward learning in fruit fly larvae (Selcho et al., 2009). Accordingly, in the cricket and the honey bee as well, punishment rather than reward learning is impaired by dopamine receptor antagonists (Unoki et al., 2005, 2006; Vergoz et al., 2007). Finally, activating a set of dopaminergic neurons, defined by the TH-Gal4 driver in adult (Claridge-Chang et al., 2009; Aso et al., 2010) and reportedly also in larval (Schroll et al., 2006) fruit flies substitutes for punishment during training. Altogether, these results point to dopamine as covered by the applied genetic tools, to be necessary and sufficient to signal punishment. As for reward signaling, this reinforcing role seems to be fulfilled by octopamine. 
In the honeybee, activity of a sugar responsive octopaminergic neuron “VUMmx1,” innervating the olfactory pathway, is sufficient to substitute for the rewarding, but not the reflex-releasing, effects of sugar during training (Hammer, 1993), as does injecting octopamine at various sites along the olfactory pathway (Hammer and Menzel, 1998). In turn, interfering with the honey bee or cricket octopamine receptors impairs reward learning, but leaves punishment learning intact (Farooqui et al., 2003; Unoki et al., 2005, 2006; Vergoz et al., 2007). Accordingly, in the fruit fly, compromising octopamine biosynthesis via the tbh mutation impairs reward learning, but not punishment learning (Schwaerzel et al., 2003; Sitaraman et al., 2010). Finally, in larval fruit flies, the output from a particular set of octopaminergic/tyraminergic neurons, defined by the TDC2-Gal4 driver seems to be required selectively for reward learning (see Honjo and Furukubo-Tokunaga, 2009, but see above); in turn, activating these neurons reportedly substitutes for the reward during training (Schroll et al., 2006). These findings together suggest a double dissociation between the roles of dopamine and octopamine in signaling punishment and reward, respectively. This double dissociation however may need qualification, as the function of the fruit fly dopamine receptor dDA1 turns out to be required for both kinds of learning (in adults: Kim et al., 2007; in larvae: Selcho et al., 2009). The picture becomes more complicated with the additional role of dopaminergic neurons in signaling the state of hunger, which is a determinant for the behavioral expression of the sugar-reward memory in adult fruit flies (Krashes et al., 2009; in other insects, too, octopamine and dopamine affect the behavioral expression of memory, Farooqui et al., 2003; Mizunami et al., 2009; also in crabs: Kaczer and Maldonado, 2009). 
Finally, in a fruit fly operant place learning paradigm, where high temperature acts as punishment and preferred temperature as potential reward, neither dopamine nor octopamine signaling seems to be critical (Sitaraman et al., 2008, 2010). Thus, the scope of what octopamine and dopamine do for punishment and reward learning, memory, and retrieval remains open, including (except for the seminal case of the VUMmx1 neuron in the bee, Hammer, 1993, and a recent study on dopaminergic signaling in the fly, Aso et al., 2010) the assignment of these putative roles to specific amine-releasing and receiving neurons and the receptors involved, as well as the utility of the genetic tools available. Here, we ask for the neurogenetic bases of relief learning, comparing the underpinnings of relief learning to punishment and reward learning.

Materials and Methods

Flies

Drosophila melanogaster were reared as mass culture at 25°C, 60–70% relative humidity, under a 14:10 h light:dark cycle. We used shibire for temperature-controlled, reversible blockage of synaptic output (Kitamoto, 2001). shibire expression was directed to different sets of neurons by crossing the males of the respective Gal4 strains (Table 1) to females of a UAS-shibire strain (Kitamoto, 2001; first and third chromosomes); thus the offspring were heterozygous for both the Gal4-driver and UAS-shibire. We refer to these flies with the name of the Gal4-driver together with “shi” (e.g., “TH/shi”). To obtain proper genetic controls, we crossed each of the UAS-shibire or the Gal4-driver strains to white flies, thus obtaining flies heterozygous either for the Gal4-driver or for UAS-shibire. We refer to these as, e.g., “TH/+” and “shi/+,” respectively.

Table 1

The Gal4 driver strains that were used.

	Gal4 driver	Gal4 expression in	Chromosome	References
TH	Regulatory sequences of tyrosine hydroxylase gene	Dopaminergic neurons	Third	Friggi-Grelin et al. (2003), Schwaerzel et al. (2003), Riemensperger et al. (2005), Schroll et al. (2006), Zhang et al. (2007), Sitaraman et al. (2008), Claridge-Chang et al. (2009), Honjo and Furukubo-Tokunaga (2009), Krashes et al. (2009), Mao and Davis (2009), Selcho et al. (2009), Aso et al. (2010)
DDC	Regulatory sequences of dopa decarboxylase gene	Dopaminergic/serotonergic neurons	Third	Li et al. (2000), Sitaraman et al. (2008)
TDC2	Regulatory sequences of the neuronal tyrosine decarboxylase gene	Octopaminergic/tyraminergic neurons	Second	Cole et al. (2005), Schroll et al. (2006), Busch et al. (2009), Honjo and Furukubo-Tokunaga (2009), Sitaraman et al. (2010)

Bold font indicates the original report of the respective Gal4 strain.

To approximate the patterns of Gal4 expression, we used the respective drivers (Table 1) to express the UAS-controlled transgene mCD8GFP, which encodes a green fluorescent protein (GFP) that inserts into cellular membranes. To do this, we crossed males from each driver strain to females of a UAS-mCD8GFP strain (Lee and Luo, 1999; second chromosome) and stained the brains of the progeny against the Synapsin protein to visualize the neuropils and against GFP to approximate the pattern of Gal4 expression. Note, however, that the pattern of GFP-immunoreactivity does not necessarily reflect which neurons would be targeted had another effector, e.g., shibire, been expressed using the same Gal4 driver (Ito et al., 2003). First, UAS-mCD8GFP and UAS-shibire may support different levels and patterns of background expression without any Gal4; this background expression then adds to the driven expression when the Gal4 is present. Second, the level of mCD8GFP expression sufficient for immunohistochemical detection may well be different from the level of shibire expression sufficient to block neuronal output; thus potentially, not all neurons that are visualized by immunohistochemistry may be affected by shibire, or vice versa.

To test for an effect of an octopamine biosynthesis deficiency, we used the mutant strain tbh (Monastirioti et al., 1996; also see Schwaerzel et al., 2003; Saraswati et al., 2004; Scholz, 2005; Brembs et al., 2007; Certel et al., 2007; Hardie et al., 2007; Sitaraman et al., 2010). These flies have reduced or no octopamine (Monastirioti et al., 1996), due to the deficiency of the tyramine β-hydroxylase enzyme, which catalyzes the last step of octopamine biosynthesis (Figure 2). 
Since the original tbh strain (Monastirioti et al., 1996) contains an additional mutation in the white gene, we instead used a recombinant strain with a wild-type white+ allele, which was generated by Schwaerzel et al. (2003). As genetic control, we used a non-recombinant strain with wild-type tbh+ and white+ alleles, which was generated in parallel; we refer to this strain simply as “Control.”
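
The enzymatic logic behind these genetic tools can be summarized in a small lookup sketch. The steps listed below follow the standard biosynthesis scheme that Figure 2 summarizes (tyrosine to dopamine via TH and DDC, tyrosine to octopamine via TDC and TβH, tryptophan to serotonin via TPH and DDC). The data layout and the function names `enzymes_for` and `amines_requiring` are illustrative assumptions and not part of the study.

```python
# Biosynthesis steps as (substrate, enzyme, product).
# The tuple layout and all names are illustrative assumptions; "TbH" stands for TβH.
PATHWAY = [
    ("tyrosine", "TH", "L-DOPA"),
    ("L-DOPA", "DDC", "dopamine"),
    ("tyrosine", "TDC", "tyramine"),
    ("tyramine", "TbH", "octopamine"),
    ("tryptophan", "TPH", "5-HTP"),
    ("5-HTP", "DDC", "serotonin"),
]
AMINES = ("dopamine", "tyramine", "octopamine", "serotonin")


def enzymes_for(product):
    """Enzymes on the route from the amino-acid precursor to `product`."""
    enzymes = set()
    current = product
    while True:
        # Each product has exactly one producing step in this simplified scheme.
        step = next((s for s in PATHWAY if s[2] == current), None)
        if step is None:  # reached an amino-acid precursor
            return enzymes
        enzymes.add(step[1])
        current = step[0]


def amines_requiring(enzyme):
    """Amines whose synthesis route includes `enzyme`."""
    return [a for a in AMINES if enzyme in enzymes_for(a)]


print(amines_requiring("TbH"))  # ['octopamine']
print(amines_requiring("DDC"))  # ['dopamine', 'serotonin']
print(amines_requiring("TDC"))  # ['tyramine', 'octopamine']
```

Read this way, the scheme shows why a TβH deficiency is expected to compromise octopamine but not tyramine synthesis, and why the DDC-Gal4 and TDC2-Gal4 drivers each label a mixed population of aminergic neurons.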

Figure 2

Immunohistochemistry

Brains were dissected in saline and fixed for 2 h in 4% formaldehyde with PBST as solvent (phosphate-buffered saline containing 0.3% Triton X-100). After a 1.5 h incubation in blocking solution (3% normal goat serum [Jackson Immuno Research Laboratories Inc., West Grove, PA, USA] in PBST), brains were incubated overnight with the monoclonal anti-Synapsin mouse antibody SYNORF1, diluted 1:20 in PBST (Klagges et al., 1996) and polyclonal anti-GFP rabbit antibody, diluted 1:2000 in PBST (Invitrogen Molecular Probes, Eugene, OR, USA). These primary antibodies were detected after an overnight incubation with Cy3 goat anti-mouse Ig, diluted 1:250 in PBST (Jackson Immuno Research Laboratories Inc., West Grove, PA, USA) and Alexa488 goat anti-rabbit Ig, diluted 1:1000 in PBST (Invitrogen Molecular Probes, Eugene, OR, USA). All incubation steps were followed by multiple PBST washes. Incubations with antibodies were done at 4°C; all other steps were performed at room temperature. Finally, brains were mounted in Vectashield mounting medium (Vector Laboratories Inc., Burlingame, CA, USA) and examined under a confocal microscope (Leica SP1, Leica, Wetzlar, Germany).

Behavioral assays

Flies were collected from fresh food vials and kept for 1–4 days at 18°C and 60–70% relative humidity before experiments. For reward learning as well as for the punishment learning experiments shown in Figures 6B,B′, flies were instead starved overnight for 18–20 h at 25°C and 60–70% relative humidity in vials equipped with a moist tissue paper and a moist filter paper. Those experiments that did not use shibire were performed at 22–25°C and 75–85% relative humidity. For inducing the effect of shibire, flies were first exposed to 34–36°C and 60–70% relative humidity for 30 min; then the experiment took place under these same conditions, which are referred to as “@ high temperature.” The condition referred to as “@ low temperature” in turn involved exposing the flies to 20–23°C and 75–85% relative humidity for 30 min; then the experiment followed also under these conditions.

Figure 6

Compromising octopamine biosynthesis using the tbh mutation. We used the tbh mutant, which has reduced or no octopamine. When the odors 3-octanol (OCT) and 4-methylcyclohexanol (MCH) were used, reward learning was partially impaired (A). Using the odors n-amyl acetate (AM) and isoamyl acetate (IAA) revealed a complete lack of reward learning in the tbh mutant (A′). When the odors OCT and benzaldehyde (BA) were used, reward learning was intact in the tbh mutant (A′′). A modified punishment learning procedure, which was identical to reward learning except that sugar presentation was replaced by shock pulses, revealed no impairment in the tbh mutant when either the odors OCT and MCH (B) or AM and IAA (B′) were used. Finally, under those conditions for which reward learning of the tbh mutant was partially impaired, i.e., using the odors OCT and MCH, relief learning remained unaffected (C). For this experiment, the odors AM and IAA were not used, as these do not support relief learning (Yarali et al., 2008, loc. cit. Figure 5D). *P < 0.05, NS: P > 0.05, while comparing between genotypes. While comparing scores of each genotype to 0 *P < 0.05/2, NS: P > 0.05/2 (i.e., Bonferroni correction). Sample sizes were from left to right N = 40, 39 in (A), 11, 13 in (A′), 23, 22 in (A′′), 12, 12 in (B), 9, 9 in (B′), and 20, 20 in (C). Box plots are as detailed in Figure 4.

The experimental setup was in principle as described by Tully and Quinn (1985) and Schwaerzel et al. (2003). Flies were trained and tested as groups of 100–150. Training took place under dim red light, which does not allow flies to see; tests were in complete darkness. As odorants, 90 μl benzaldehyde (BA), 340 μl 3-octanol (OCT), 340 μl 4-methylcyclohexanol (MCH), 340 μl n-amyl acetate (AM) and 340 μl isoamyl acetate (IAA) (CAS 100-52-7, 589-98-0, 589-91-3, 628-63-7, 123-92-2; all from Fluka, Steinheim, Germany) were applied in 1 cm-deep Teflon containers of 5, 14, 14, 14, and 14 mm diameters, respectively. For the experiments in Figures 6A,B,C, MCH and OCT were diluted 100-fold in paraffin oil (Merck, Darmstadt, Germany, CAS 8012-95-1), whereas for Figures 6A′,B′, AM and IAA were diluted 36-fold. All other experiments used undiluted BA and OCT. For punishment learning (Figure 1A), flies received six training trials. Each trial started by loading the flies into the experimental setup (0:00 min). From 4:00 min on, the control odor was presented for 15 s. Then, from 7:15 min on, the to-be-learned odor was presented, also for 15 s. From 7:30 min on, electric shock was applied as four pulses of 100 V; each pulse was 1.2 s long and was followed by the next with an onset-to-onset interval of 5 s. Thus the to-be-learned odor preceded shock with an onset-to-onset interval of 15 s. The control odor, on the other hand, preceded the shock by an onset-to-onset interval of 210 s, which does not result in a measurable association between the two (Tanimoto et al., 2004; Yarali et al., 2008, loc. cit. Figures 1D and 2F, Yarali et al., 2009, loc. cit. Figure 1B). For relief learning (Figure 1B), keeping all other parameters unchanged, we reversed the relative timing of events: that is, the to-be-learned odor was presented from 8:10 min on, thus following shock with an onset-to-onset interval of 40 s. 
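
The onset-to-onset intervals quoted above follow directly from the trial timeline. A minimal sketch recomputes them from the stated clock times; the helper names `to_seconds` and `pulse_onsets` are hypothetical and introduced only for illustration.

```python
def to_seconds(clock):
    """Convert an 'm:ss' trial time such as '7:30' to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


# Event onsets within one training trial, as stated in the text.
SHOCK_ONSET = to_seconds("7:30")
CONTROL_ODOR = to_seconds("4:00")
PUNISHMENT_ODOR = to_seconds("7:15")  # to-be-learned odor, punishment training
RELIEF_ODOR = to_seconds("8:10")      # to-be-learned odor, relief training


def pulse_onsets(first_onset, n_pulses=4, interval=5):
    """Onset times (s) of the shock pulses, spaced onset-to-onset."""
    return [first_onset + i * interval for i in range(n_pulses)]


# Negative values: odor precedes shock; positive values: odor follows shock.
print(CONTROL_ODOR - SHOCK_ONSET)     # -210
print(PUNISHMENT_ODOR - SHOCK_ONSET)  # -15
print(RELIEF_ODOR - SHOCK_ONSET)      # 40
print(pulse_onsets(SHOCK_ONSET))      # [450, 455, 460, 465]
```

Expressing odor timing relative to shock onset makes the single difference between the two procedures explicit: only the sign and size of this interval change for the to-be-learned odor.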
At 12:00 min, flies were transferred out of the setup into food vials, where they stayed for 16 min until the next trial. At the end of the sixth training trial, after the usual 16 min break, flies were loaded back into the setup. After a 5 min accommodation period, they were transferred to the choice point of a T-maze, where they could escape toward either the control odor or the learned odor. After 2 min, the arms of the maze were closed and flies on each side were counted. A preference index (PREF) was calculated as:

$$\mathrm{PREF} = \frac{(\#_{\text{Learned odor}} - \#_{\text{Control odor}}) \times 100}{\#_{\text{Total}}} \qquad (1)$$

# indicates the number of flies found in the respective maze-arm. Two groups of flies were trained and tested in parallel (Figure 1D). For one of these, e.g., 3-octanol (OCT) was the control odor and BA was to be learned; the second group was trained reciprocally. PREFs from the two reciprocal measurements were then averaged to obtain a final learning index (LI):

$$\mathrm{LI} = \frac{\mathrm{PREF}_{\text{BA}} + \mathrm{PREF}_{\text{OCT}}}{2} \qquad (2)$$

Subscripts of PREF indicate the learned odor in the respective training. Positive LIs indicate conditioned approach to the learned odor; negative values reflect conditioned avoidance.

Figure 1

Training. For punishment training (A), flies received two odors and pulses of electric shock. A control odor was presented long before shock; a to-be-learned odor preceded shock with an onset-to-onset interval of 15 s. For relief training (B), while all other parameters were unchanged, the to-be-learned odor followed shock with an onset-to-onset interval of 40 s. For reward training (C), flies were successively exposed to a to-be-learned odor in the presence of sugar and then to a control odor without any sugar. Although not shown here, in half of the cases, reward training started with the control odor instead of the to-be-learned odor and sugar. For each kind of training, we used a reciprocal design (D): two groups were trained in parallel; for one of these, e.g., 3-octanol (OCT) was the control odor and benzaldehyde (BA) was to be learned; the other group was trained reciprocally. Each group was then given the choice between the two odors. Based on the flies’ distribution, preference indices (PREF) were calculated. Based on the two reciprocal PREF values, we calculated a learning index (LI). The situation is sketched for punishment learning, but also applies to relief and reward learning.

Reward learning (Figure 1C) used two training trials. Each trial started by loading the flies into the setup (0:00 min). 
One minute later, flies were transferred to a tube lined with a filter paper which was soaked the previous day with 2 ml of 2 M sucrose solution, and then was left to dry over night. This tube was scented with the to-be-learned odor. After 45 s, the to-be-learned odor was removed, and after 15 additional seconds flies were taken out of the tube. At the end of a 1 min waiting period, they were transferred into another tube lined with a filter paper which was soaked with pure water and then dried. This second tube was scented with the control odor. After 45 s, control odor was removed and 15 s later, flies were taken out of this second tube. The next trial started immediately. This transfer between the two kinds of tube during training should prevent the learning of an association between the control odor and the sugar. For half of the cases, training trials started with the to-be-learned odor and sugar; in the other half, control odor was given precedence. Once the training was completed, after a 3 min waiting period, flies were transferred to the choice point of a T-maze between the control odor and the learned odor. After 2 min, the arms of the maze were closed, flies on each side were counted and a preference index (PREF) was calculated according to Eq. 1. As detailed above (also see Figure 1D), two groups were trained reciprocally and the LI was calculated based on their PREF values according to Eq. 2. Finally, a modified punishment training procedure (not shown in Figure 1) imitated the reward learning as in Figure 1C, but sugar presentation was replaced by 12 pulses of 100 V electric shock, each lasting 1.2 s and separated by an onset-to-onset interval of 5 s.
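
As a worked illustration of the reciprocal design, the sketch below computes PREF and LI from fly counts. It assumes that PREF is the difference between the number of flies on the learned-odor side and on the control-odor side, expressed as a percentage of all counted flies, and that LI is the mean of the two reciprocal PREF values. The function names and the example counts are hypothetical and are not data from this study.

```python
def pref(n_learned, n_control):
    """Preference index in percent; positive means approach to the learned odor.

    Assumption: the total is the sum of the flies counted in the two maze arms.
    """
    total = n_learned + n_control
    if total == 0:
        raise ValueError("no flies counted")
    return (n_learned - n_control) * 100 / total


def learning_index(pref_a, pref_b):
    """Average the PREF values of the two reciprocally trained groups."""
    return (pref_a + pref_b) / 2


# Hypothetical counts for two reciprocally trained groups.
pref_ba = pref(n_learned=40, n_control=60)   # BA was the learned odor
pref_oct = pref(n_learned=30, n_control=70)  # OCT was the learned odor
print(pref_ba, pref_oct, learning_index(pref_ba, pref_oct))  # -20.0 -40.0 -30.0
```

Averaging the two reciprocal groups means that an innate preference for one of the odors pushes the two PREF values in opposite directions and cancels out of the LI, leaving the component that depends on training.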

Statistics

All data were analyzed using non-parametric statistics and are reported as box plots, showing the median as the midline and 10, 90, and 25, 75% as whiskers and box boundaries, respectively. For comparing scores of individual groups to 0, we used one-sample sign tests. Mann–Whitney U-tests and Kruskal–Wallis tests were used for pair-wise and global between-group comparisons, respectively. When multiple tests of one kind were performed within a single experiment, we adjusted the experiment-wide error-rate to 5% by Bonferroni correction: we divided the critical P < 0.05 by the number of tests. One-sample sign tests were done using a web-based tool (http://www.fon.hum.uva.nl/Service/Statistics/Sign_Test.html). All other statistical analyses were performed with the software Statistica (Statsoft, Tulsa, OK, USA). Sample sizes are reported in the figure legends.
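
A minimal sketch of two of the procedures described above, written with the Python standard library rather than the web-based tool or Statistica: an exact two-sided one-sample sign test against 0, and a Bonferroni-adjusted critical P. The function names and the example scores are hypothetical; dropping scores equal to 0 is a common convention assumed here, not something the text specifies.

```python
from math import comb


def sign_test_p(scores, reference=0.0):
    """Exact two-sided one-sample sign test; scores equal to the reference are dropped."""
    above = sum(s > reference for s in scores)
    below = sum(s < reference for s in scores)
    n = above + below
    if n == 0:
        return 1.0
    k = min(above, below)
    # Probability of a split at least this extreme under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test critical P that keeps the experiment-wide error rate at alpha."""
    return alpha / n_tests


scores = [-42, -35, -51, -28, -39, -44, -30, -47]  # hypothetical learning indices
p = sign_test_p(scores)
print(p, p < bonferroni_alpha(3))  # 0.0078125 True
```

With eight scores, only a group in which all scores share the same sign can pass a threshold of 0.05/3, which illustrates how conservative the combination of a sign test and Bonferroni correction is at small sample sizes.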

Results

Blocking output from two different sets of dopaminergic neurons

First, we compared relief learning to punishment learning in terms of the roles of dopaminergic neurons. We confirmed that blocking the output from a particular set of dopaminergic neurons, using the temperature-sensitive UAS-shibire in combination with the TH-Gal4 driver (Friggi-Grelin et al., 2003, Table 1; Figures 2 and 3A), impairs punishment learning: when trained and tested at high temperature, TH/shi flies showed less negative learning scores than the genetic controls (Figure 4A @ high temperature: Kruskal–Wallis test: H = 11.44, d.f. = 2, P < 0.05). This impairment in punishment learning, however, was obviously partial in the TH/shi flies (Figure 4A @ high temperature: one-sample sign tests: P < 0.05/3 for each genotype), as was the case in previous studies (Schwaerzel et al., 2003; Aso et al., 2010). This residual learning ability may be due to incomplete coverage of dopaminergic neurons by the TH-Gal4 driver (Friggi-Grelin et al., 2003; Sitaraman et al., 2008; Claridge-Chang et al., 2009; Mao and Davis, 2009; see the Discussion for details) and/or to an incomplete block of neuronal output by shibire. At low temperature, as shibire was benign, TH/shi flies performed comparably to the genetic controls in punishment learning (Figure 4A @ low temperature: Kruskal–Wallis test: H = 2.06, d.f. = 2, P = 0.36).
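
The reported P values can be sanity-checked from the H statistics. The sketch below assumes that the P values come from the usual chi-square approximation of H (an assumption; the text does not state how Statistica computed them). With d.f. = 2, the chi-square upper-tail probability has the closed form exp(−H/2). The function name is hypothetical.

```python
from math import exp


def kruskal_wallis_p_df2(h):
    """Upper-tail chi-square probability for d.f. = 2, which equals exp(-H / 2)."""
    return exp(-h / 2)


# H values reported above for punishment learning (TH/shi and genetic controls).
print(round(kruskal_wallis_p_df2(11.44), 4))  # high temperature: 0.0033
print(round(kruskal_wallis_p_df2(2.06), 2))   # low temperature: 0.36
```

Under this assumption the low-temperature value reproduces the reported P = 0.36, and the high-temperature value is consistent with the reported P < 0.05.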

Figure 3

Approximated patterns of Gal4 expression by the used drivers. We drove the expression of a membrane bound green fluorescent protein (mCD8GFP) using three different Gal4 drivers. Patterns of GFP-immunoreactivity (green) should approximate the respective patterns of Gal4-expression; Synapsin-immunoreactivity (magenta) shows the organization of the neuropils. We display projections of frontal optical sections of 0.9 μm, each. In each row, the leftmost panel shows the anterior-most projection; in each panel, dorsal is to the top. When driven by TH-Gal4 (A), GFP was expressed in neurons that innervate the mushroom body vertical lobes and peduncles (left and middle panels) as well as the fan-shaped body (middle panel) and the protocerebral bridge (right panel). We found no innervation of the antennal lobes or the mushroom body calyces (but see Mao and Davis, 2009). Under the control of the DDC-Gal4 driver (B), GFP was expressed in neurons that innervate the subesophageal ganglion (left and middle panels) as well as the horizontal lobes of the mushroom body (right; see also the inset). Neurons that express GFP, driven by TDC2-Gal4 (C) innervated the antennal lobes (left panel), mushroom body γ-lobes and their spurs (left panel, inset), the subesophageal ganglion (left and middle panels), the areas surrounding the esophagus (middle panel), and the mushroom body calyces (right panel; see also the inset).

Figure 4

Targeting a set of dopaminergic neurons, using the TH-Gal4 driver. We expressed shibire in the set of dopaminergic neurons defined by the TH-Gal4 driver. Punishment learning was partially impaired at high temperature (A, left), but not at low temperature (A, right). Contrarily, relief learning remained unaffected even at high temperature (B). *P < 0.05 and NS: P > 0.05 while comparing between genotypes. While comparing scores of each genotype to 0 *P < 0.05/3, to keep the experiment-wide error-rate at 5% (i.e., Bonferroni correction). Sample sizes were N = 8, each in (A) and 13, each in (B). Box plots show the median as the midline; 25 and 75% as the box boundaries and 10 and 90% as whiskers.

Figure 2

Biosynthesis of dopamine, tyramine, octopamine, and serotonin. DDC, dopa decarboxylase; TβH, tyramine β-hydroxylase; TDC, tyrosine decarboxylase; TH, tyrosine hydroxylase; TPH, tryptophan hydroxylase. Modified from Monastirioti (1999).

Importantly, blocking output from TH-Gal4 neurons, a treatment which did impair punishment learning, left relief learning intact: with training and test at high temperature, we found relief learning scores of TH/shi flies to be indistinguishable from the genetic controls (Figure 4B @ high temperature: Kruskal–Wallis test: H = 0.10, d.f. = 2, P = 0.96). Accordingly pooling the data, we found conditioned approach (Figure 4B @ high temperature: one-sample sign test for the pooled data set: P < 0.05). One might argue that the generally low relief learning scores may not allow detecting a possible partial impairment due to neurogenetic intervention. This however does not apply to Figure 4B, as relief learning in the TH/shi flies does not even tend to be inferior to the genetic controls (similarly, see Figures 5B, 6C, and 7B). We note that punishment and relief learning procedures differ only with respect to the timing of the to-be-learned odor during training; otherwise they entail the same handling and stimulus–exposure. Therefore, intact relief learning in the TH/shi flies (Figure 4B) excludes sensory and/or motor problems as potential cause for the impairment in punishment learning (Figure 4A, left).

Figure 5

Targeting a set of dopaminergic/serotonergic neurons, using the DDC-Gal4 driver. We expressed shibire in the set of dopaminergic/serotonergic neurons defined by the DDC-Gal4 driver. At high temperature, neither punishment learning (A), nor relief learning (B) was affected. NS: P > 0.05, while comparing between genotypes. Sample sizes were from left to right N = 13, 11, 12 in (A) and 12, 11, 12 in (B). Box plots are as detailed in Figure 4.

Figure 7

Targeting a set of octopaminergic/tyraminergic neurons, using the TDC2-Gal4 driver. We expressed shibire in the set of octopaminergic/tyraminergic neurons defined by the TDC2-Gal4 driver. At high temperature, neither reward learning (A) nor relief learning (B) was impaired. NS: P > 0.05, while comparing between genotypes. Sample sizes were from left to right N = 24, 27, 27 in (A) and 11, each in (B). Box plots are as detailed in Figure 4.

Next, we used an independent driver, DDC-Gal4 (Li et al., 2000; Table 1; Figures 2 and 3B), to express UAS-shibire in a set of dopaminergic/serotonergic neurons. Blocking the output from these neurons left punishment learning unaffected: when trained and tested at high temperature, DDC/shi flies showed learning scores comparable to the genetic controls (Figure 5A @ high temperature: Kruskal–Wallis test: H = 2.14, d.f. = 2, P = 0.34). Thus pooling the scores across genotypes, we observed conditioned avoidance (Figure 5A @ high temperature: one-sample sign test for the pooled data set: P < 0.05). This lack of effect on punishment learning may be caused by (i) the DDC-Gal4 driver not covering all dopaminergic neurons; (ii) incomplete overlap with those dopaminergic neurons targeted by the TH-Gal4 (Sitaraman et al., 2008; Claridge-Chang et al., 2009; Mao and Davis, 2009; see the Discussion for details); (iii) incomplete block of synaptic output by shibire; or (iv) a dominant-negative effect of DDC-Gal4, which is non-additive with the effect of shibire expression in these neurons (see below). In any case, we probed for an effect of blocking output from the DDC-Gal4 neurons on relief learning and found none: after training and test at high temperature, learning scores were not different between genotypes (Figure 5B @ high temperature: Kruskal–Wallis test: H = 1.24, d.f. = 2, P = 0.54). We thus pooled the data and found weak yet significant conditioned approach (Figure 5B @ high temperature: one-sample sign test for the pooled data set: P < 0.05). 
We note that the DDC/+ flies tended to show less pronounced punishment and relief learning when compared to the TH/+ flies (compare Figure 4 versus Figure 5) as well as when compared to the shi/+ flies (Figure 5). In the case of punishment learning, as we used a Kruskal–Wallis test across all three experimental groups, this effect of the DDC-Gal4 driver construct may have obscured an actual effect of blocking the output from DDC-Gal4-targeted neurons (compare shi/+ to DDC/shi in Figure 5A). For relief learning, however, no corresponding trend is noted (compare shi/+ to DDC/shi in Figure 5B). In any case, with respect to the role of the neurons defined by DDC-Gal4, our results do not offer an argument to dissociate punishment from relief learning. To summarize, concerning the neurons defined by TH-Gal4, we found a clear dissociation between punishment and relief learning (Figure 4), while for the DDC-Gal4 neurons the situation remains inconclusive (Figure 5). We would like to stress that this does not at all exclude a role for the dopaminergic system in relief learning, given that first, in neither experiment did we cover all dopaminergic neurons at once, and second, as a general concern, blockage of neuronal output by shibire may well be incomplete (see the Discussion for details).
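
As a quick plausibility check on the Kruskal–Wallis results quoted above, the reported P values can be recomputed from the reported H statistics. The sketch below assumes that P was obtained from the usual chi-square approximation with d.f. = 2 (three genotypes), in which case the upper-tail probability reduces to exp(−H/2). The function name is illustrative only, and whether the authors' software used this approximation or an exact procedure is not stated in the text.

```python
import math


def kruskal_wallis_p_df2(h: float) -> float:
    """Upper-tail chi-square probability for d.f. = 2, which equals exp(-H/2).

    Assumption: the chi-square approximation to the Kruskal-Wallis H
    distribution; valid only for comparisons across three groups.
    """
    return math.exp(-h / 2.0)


# H and P values as reported in the text for the DDC-Gal4 experiments.
reported = {
    "Figure 5A (punishment learning)": (2.14, 0.34),
    "Figure 5B (relief learning)": (1.24, 0.54),
}

for label, (h, p_reported) in reported.items():
    p = kruskal_wallis_p_df2(h)
    print(f"{label}: H = {h:.2f}, recomputed P = {p:.2f}, reported P = {p_reported:.2f}")
```

Under this assumption the recomputed values round to the reported ones, so the statistics quoted for Figure 5 are internally consistent.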

Compromising octopamine biosynthesis

Next, we compared relief learning to reward learning in terms of the role of octopamine. We first confirmed that compromising octopamine biosynthesis via the tbh mutation in the key enzyme tyramine β-hydroxylase (Monastirioti et al., 1996; Figure 2) impairs reward learning: after odor-sugar training, using the odors 3-octanol (OCT) and 4-methylcyclohexanol (MCH), the tbh mutant showed significantly less conditioned approach than the genetic Control (Figure 6A: U-test: U = 544.00, P < 0.05). Residual reward learning ability was, however, detectable in the tbh mutant (Figure 6A: one-sample sign tests: P < 0.05/2 for each genotype). This contrasts with the report of Sitaraman et al. (2010), who showed a complete loss of reward learning using the same odors; the discrepancy may be due to the different genetic backgrounds used in the two studies (i.e., the present study uses the strains from Schwaerzel et al., 2003, whereas Sitaraman et al., 2010 used those from Certel et al., 2007). Schwaerzel et al. (2003) found no reward learning ability in the tbh mutant, using the odors ethyl acetate and isoamyl acetate (IAA); indeed, using n-amyl acetate (AM) and IAA as odors, we also found a complete loss of reward learning in the tbh mutant (Figure 6A′: U-test: U = 33.00, P < 0.05; one-sample sign tests: P < 0.05/2 for Control, and P = 0.58 for the tbh mutant). Surprisingly, however, when the odors OCT and benzaldehyde (BA) were used, tbh mutant flies showed fully intact reward learning (Figure 6A′′: U-test: U = 204.50, P = 0.27; one-sample sign test for the pooled data set: P < 0.05). This lack of effect in Figure 6A′′ is unlikely to be due to the relatively low learning indices of the Control flies, since in Figure 6A we could detect even a partial effect of the tbh mutation despite similarly low Control scores.
Note that with the present two-odor reciprocal training design (Figure 1D), the contribution of each odor to the LI remains unresolved, and so does the question of whether the tbh mutation affects learning about one of the odors but not the other. We can however conclude that the reward learning impairment of the tbh mutant can be partial, complete, or absent, depending on the combination of odors used and likely also on the genetic background; this suggests residual octopaminergic function and/or an octopamine-independent compensatory mechanism (see the Discussion for details). To test for an effect of the tbh mutation on punishment learning, we used a modified training procedure, which entailed the same pre-starvation, handling, and stimulus–exposure as reward learning, except that the sugar presentation was replaced by shock pulses. In such modified punishment learning, the tbh mutant performed comparably to the genetic Control, using either the odors OCT and MCH (Figure 6B: U-test: U = 47.00, P = 0.15; one-sample sign test for the pooled data set: P < 0.05) or AM and IAA (Figure 6B′: U-test: U = 38.00, P = 0.82; one-sample sign test for the pooled data set: P < 0.05). Thus, confirming Schwaerzel et al. (2003), we can conclude that reward and punishment learning are dissociated in terms of the effect of the tbh mutation. In addition, normal performance of the tbh mutant in this modified punishment learning makes deficiencies in odor perception or motor control unlikely as causes for the reward learning impairment (Figures 6A,A′). In order to test for an effect of the tbh mutation on relief learning, we used the odors OCT and MCH, because the odors AM and IAA do not support relief learning (Yarali et al., 2008, loc. cit. Figure 5D).
Under conditions for which the tbh mutant did show a reward learning impairment, however partial (i.e., using the odors OCT and MCH), relief learning ability remained unaffected: learning scores were statistically indistinguishable between genotypes (Figure 6C: U-test: U = 168.00, P = 0.40), with no apparent trend for lower scores in the tbh mutant. We thus pooled the data and found weak yet significant conditioned approach (Figure 6C: one-sample sign test for the pooled data set: P < 0.05).
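
The U-test results for the tbh experiments can be cross-checked in a similar spirit. The sketch below recomputes an approximate two-sided P from each reported U statistic and the sample sizes given in the Figure 6 legend, using the large-sample normal approximation to the Mann–Whitney U distribution. It is an assumption that this approximation, without tie or continuity correction, is comparable to what the authors' software did; the function name and panel labels are illustrative only.

```python
import math


def mann_whitney_p_normal(u: float, n1: int, n2: int) -> float:
    """Approximate two-sided P for a Mann-Whitney U statistic.

    Assumption: normal approximation with no tie correction and no
    continuity correction; exact or tie-corrected P values may differ.
    """
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of a standard normal deviate.
    return math.erfc(abs(z) / math.sqrt(2.0))


# (U, n1, n2) taken from the text and from the Figure 6 legend.
panels = {
    "6A   reward, OCT/MCH": (544.00, 40, 39),
    "6A'  reward, AM/IAA": (33.00, 11, 13),
    "6A'' reward, OCT/BA": (204.50, 23, 22),
    "6B   punishment, OCT/MCH": (47.00, 12, 12),
    "6B'  punishment, AM/IAA": (38.00, 9, 9),
    "6C   relief, OCT/MCH": (168.00, 20, 20),
}

for label, (u, n1, n2) in panels.items():
    p = mann_whitney_p_normal(u, n1, n2)
    print(f"{label}: U = {u:.2f}, approximate P = {p:.3f}")
```

Under these assumptions the approximation reproduces the reported pattern: P < 0.05 for the two reward-learning panels that show an impairment (A, A′), and P > 0.05 for the remaining panels. Small numerical deviations from the reported values are to be expected, given the unknown handling of ties.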

Blocking the output from a set of octopaminergic/tyraminergic neurons

As an additional, independent approach to the octopaminergic system, we blocked the output from a set of octopaminergic/tyraminergic neurons, using UAS-shibire in combination with the TDC2-Gal4 driver (Cole et al., 2005; Table 1; Figures 2 and 3C). We first tested for an effect on reward learning: when trained and tested at high temperature, TDC2/shi flies performed comparably to the genetic controls (Figure 7A @ high temperature: Kruskal–Wallis test: H = 3.03, d.f. = 2, P = 0.22). Accordingly, pooling the learning scores across genotypes, we found conditioned approach (Figure 7A @ high temperature: one-sample sign test for the pooled data set: P < 0.05). This lack of effect on reward learning may be because the TDC2-Gal4 driver does not target all octopaminergic neurons (Busch et al., 2009; see the Discussion for details) and/or because the output from the targeted neurons is not completely blocked by shibire. Nevertheless, we probed for an effect on relief learning and found none: after training and test at high temperature, learning scores were statistically indistinguishable between genotypes (Figure 7B @ high temperature: Kruskal–Wallis test: H = 2.43, d.f. = 2, P = 0.30). Accordingly, pooling the data, we found conditioned approach (Figure 7B @ high temperature: one-sample sign test for the pooled data set: P < 0.05). To summarize, while reward and relief learning are apparently dissociated when considering the tbh mutant, we can draw no distinction between these two kinds of learning in terms of the role of the neurons covered by the TDC2-Gal4 driver. Again, this does not rule out a role for the octopaminergic system in relief learning, as these conclusions refer only to the specific genetic manipulations used.
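
Throughout these analyses, a non-significant comparison between genotypes is followed by pooling the scores and testing them against zero with a one-sample sign test; where each genotype is tested separately, the criterion is tightened to P < 0.05/2. A minimal sketch of such a sign test is given below. The learning indices are hypothetical values made up for illustration, not data from the study, and the function name is likewise an assumption.

```python
from math import comb


def sign_test_p(scores, reference=0.0):
    """Two-sided exact one-sample sign test of the median against `reference`.

    Scores equal to the reference are dropped, as is conventional.
    """
    pos = sum(1 for s in scores if s > reference)
    neg = sum(1 for s in scores if s < reference)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Binomial(n, 0.5) probability of a split at least this uneven, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical learning indices (NOT data from the study): 10 positive, 2 negative.
scores = [4.1, 7.5, -2.0, 9.3, 3.3, 5.8, -1.2, 6.4, 2.2, 8.0, 1.7, 4.9]

p = sign_test_p(scores)
print(f"P = {p:.4f}")
print("significant at 0.05 (single pooled test):", p < 0.05)
print("significant at 0.05/2 (Bonferroni, two tests):", p < 0.05 / 2)
```

With these made-up scores the test passes the uncorrected criterion but not the Bonferroni-corrected one, which illustrates why the per-genotype tests are the more conservative of the two.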

Discussion

We compared relief learning to both punishment learning and reward learning, focusing on the involvement of aminergic modulation by dopamine and octopamine. As previously reported (Schwaerzel et al., 2003; Aso et al., 2010), directing the expression of UAS-shibire to a particular set of dopaminergic neurons defined by the TH-Gal4 driver partially impaired punishment learning (Figure 4A). Relief learning however was left intact (Figure 4B). Expressing UAS-shibire with another driver, DDC-Gal4, on the other hand affected neither punishment nor relief learning (Figure 5). All dopaminergic neuron clusters in the fly brain are targeted by the TH-Gal4 driver; some clusters, however, are covered only partially, e.g., 80–90% of the anterior medial “PAM cluster” neurons are left out (Friggi-Grelin et al., 2003; Sitaraman et al., 2008; Claridge-Chang et al., 2009; Mao and Davis, 2009). In contrast, the DDC-Gal4 driver, along with serotonergic neurons, likely targets most of the PAM cluster dopaminergic neurons, while possibly leaving out dopaminergic neurons in other clusters (Sitaraman et al., 2008; Figure 3B). In a mixed classical-operant olfactory punishment learning task, Claridge-Chang et al. (2009) found no impairment upon blocking the activity of most PAM cluster neurons with an inwardly rectifying K+ channel (UAS-kir2.1), driven by HL9-Gal4. Although relying on both a different Gal4 driver and a different effector, this result is in agreement with the intact punishment learning we found when expressing UAS-shibire with the DDC-Gal4 driver (Figure 5A). Thus, as far as short-term punishment learning is concerned, there is so far no evidence for a role for the PAM cluster neurons (for middle-term punishment learning, see Aso et al., 2010). Nevertheless, targeting the remaining dopaminergic neuron clusters by the TH-Gal4 driver only partially impairs punishment learning (Schwaerzel et al., 2003; Aso et al., 2010; Figure 4A).
Conceivably, the TH-Gal4 driver may leave out a few dopaminergic neurons in clusters other than PAM; these may then carry a punishment signal, redundant to that carried by the TH-Gal4-targeted neurons. This scenario would readily accommodate Schroll et al.’s (2006) report that activity of the TH-Gal4-targeted neurons in larval fruit flies substitutes for punishment. The intact relief learning upon expressing UAS-shibire with TH-Gal4 can also be explained by this scenario. Alternatively, the level of shibire expression driven by TH-Gal4 may fall short of effectively blocking the neuronal output required for relief learning, and/or an additional, shibire-resistant neurotransmission mechanism may be employed in relief learning. Further, if punishment were to be signaled by a shock-induced increase in the activity of the TH-Gal4 neurons and relief were to be signaled by a decrease in their activity below the baseline at shock offset, incomplete blockage of output from these neurons could partially impair punishment learning, while leaving relief learning intact. In the face of these caveats, we find it too early to exclude any role of dopamine or of the TH-Gal4 neurons. What then is a safe minimal conclusion? Given that punishment learning is partially impaired (Figure 4A) while relief learning does not even tend to be impaired (Figure 4B), these two kinds of learning do differ in terms of whether and what role the TH-Gal4-covered neurons play. This does dissociate punishment and relief learning in terms of their underlying mechanisms. Turning to the octopaminergic system, we confirmed Schwaerzel et al. (2003) in that the tbh mutant with compromised octopamine biosynthesis is impaired in reward learning (Figures 6A,A′), but not in punishment learning (Figures 6B,B′). The effect on reward learning was however conditional on the kinds of odor used (Figures 6A,A′,A′′).
Under the conditions that significantly impaired reward learning, we found relief learning intact (Figure 6C). Although the tbh mutant we used revealed no octopamine content in immunohistochemical and high pressure liquid chromatography (HPLC) analyses (Monastirioti et al., 1996), it may retain an amount of octopamine below the detection thresholds of these methods but sufficient to signal reward and/or relief. Furthermore, HPLC analysis reveals a ∼10-fold increase in the amount of the octopamine precursor tyramine in this mutant (Monastirioti et al., 1996); this excessive tyramine may compensate for the lack of octopamine (Uzzan and Dudai, 1982). As an additional approach, we blocked the output from a set of octopaminergic/tyraminergic neurons, expressing UAS-shibire with the TDC2-Gal4 driver; this impaired neither reward, nor relief learning (Figure 7). The TDC2-Gal4 driver targets, along with tyraminergic neurons, octopaminergic neurons in three paired neuron clusters and one unpaired neuron cluster (Busch et al., 2009). Among these, the unpaired “VM cluster” harbors octopaminergic neurons innervating on the one hand the subesophageal ganglion (SOG), and on the other hand the antennal lobes, mushroom bodies, and the lateral horn (Busch et al., 2009); such connectivity would enable signaling gustatory reward onto the olfactory pathway. Indeed, in the honey bee, activation of a single octopaminergic neuron, VUMmx1, with such an innervation pattern, is sufficient to carry the reward signal for olfactory learning (Hammer, 1993). Surprisingly however, although all octopaminergic neurons in the VM cluster are targeted by TDC2-Gal4 (Busch et al., 2009), using this driver with UAS-shibire, we found reward learning intact (Figure 7A). This may be because the level of UAS-shibire expression falls short of completely blocking the neuronal output.
Alternatively, given that activation of the TDC2-Gal4-targeted neurons in fruit fly larvae reportedly substitutes for reward (Schroll et al., 2006), the VM cluster neurons may indeed carry a reward signal, but other octopaminergic neurons outside this cluster, left out by the TDC2-Gal4 driver (Busch et al., 2009), may redundantly do so. Either kind of argument could also explain the lack of effect on relief learning (Figure 7B). Thus, although we find no evidence for a role for the octopaminergic system in relief learning, we refrain from excluding such a role. Still, given that the tbh mutation affects reward learning, but not relief learning, these two forms of learning are to some extent dissociated in their genetic requirements. Obviously, the question whether dopaminergic and octopaminergic systems are involved in relief learning remains open. Follow-up studies should extend our neurogenetic approach to further tools. For example, dopamine biosynthesis can be specifically compromised in the fly nervous system using a tyrosine hydroxylase mutant in combination with a hypoderm-specific rescue construct (Hirsh et al., 2010). Also, for two different dopamine receptors, DAMB and dDA-1, loss-of-function mutations are available (Kim et al., 2007; Selcho et al., 2009). Notably, by means of the dDA-1 receptor loss-of-function mutant, the role of the dopaminergic system in reward learning was revealed (Kim et al., 2007; Selcho et al., 2009), which had been overlooked with the tools used in the present study. In addition, a pharmacological approach would be useful. Antagonists for the vertebrate D1 and D2 receptors have been successfully used in the fruit fly (Yellman et al., 1997; Seugnet et al., 2008) and other insects (Unoki et al., 2005, 2006; Vergoz et al., 2007; for the octopamine receptors, see Unoki et al., 2005, 2006; Vergoz et al., 2007).
Such a pharmacological approach could be extended to other aminergic, as well as peptidergic systems and could also test for the effects of human psychotherapeutic drugs. The results of such studies may then guide subsequent analyses at the cellular level. To summarize, while this study has shed no light on how relief learning works, it did show that relief learning works in a way neurogenetically different from both punishment learning and reward learning, likely at the level of the roles of aminergic neurons. Interestingly, at this level punishment and reward learning are also dissociated. However, all three kinds of learning also share genetic commonalities, for example with respect to the role of the synapsin gene, likely critical for neuronal plasticity (Godenschwege et al., 2004; Michels et al., 2005; Knapek et al., 2010; T. Niewalda, Universität Würzburg, personal communication). Thus, punishment-, relief-, and reward-learning may conceivably rely on common molecular mechanisms of memory trace formation, which however are triggered by experimentally dissociable reinforcement signals, and/or operate in distinct neuronal circuits. This may be a message relevant also for analyses of relief learning in other experimental systems, including rodent (Rogan et al., 2005), monkey (Tobler et al., 2003; Belova et al., 2007; Matsumoto and Hikosaka, 2009), and man (Seymour et al., 2005; Andreatta et al., 2010).

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

64 in total

1. Backward conditioning: mediation by the context.

Authors: Raymond C Chang; Aaron P Blaisdell; Ralph R Miller
Journal: J Exp Psychol Anim Behav Process Date: 2003-07

2. Coding of predicted reward omission by dopamine neurons in a conditioned inhibition paradigm.

Authors: Philippe N Tobler; Anthony Dickinson; Wolfram Schultz
Journal: J Neurosci Date: 2003-11-12 Impact factor: 6.167

3. Ectopic G-protein expression in dopamine and serotonin neurons blocks cocaine sensitization in Drosophila melanogaster.

Authors: H Li; S Chaney; I J Roberts; M Forte; J Hirsh
Journal: Curr Biol Date: 2000-02-24 Impact factor: 10.834

4. Behavioral and neural bases of noncoincidence learning in Hermissenda.

Authors: G Britton; J Farley
Journal: J Neurosci Date: 1999-10-15 Impact factor: 6.167

5. Writing memories with light-addressable reinforcement circuitry.

Authors: Adam Claridge-Chang; Robert D Roorda; Eleftheria Vrontou; Lucas Sjulson; Haiyan Li; Jay Hirsh; Gero Miesenböck
Journal: Cell Date: 2009-10-16 Impact factor: 41.582

6. Odour avoidance learning in the larva of Drosophila melanogaster.

Authors: Sukant Khurana; Mohammed Bin Abu Baker; Obaid Siddiqi
Journal: J Biosci Date: 2009-10 Impact factor: 1.826

7. Roles of dopamine in circadian rhythmicity and extreme light sensitivity of circadian entrainment.

Authors: Jay Hirsh; Thomas Riemensperger; Hélène Coulom; Magali Iché; Jamie Coupar; Serge Birman
Journal: Curr Biol Date: 2010-01-21 Impact factor: 10.834

8. Modulation of Drosophila male behavioral choice.

Authors: Sarah J Certel; Mary Grace Savella; Dana C F Schlegel; Edward A Kravitz
Journal: Proc Natl Acad Sci U S A Date: 2007-03-05 Impact factor: 11.205

9. Trace amines differentially regulate adult locomotor activity, cocaine sensitivity, and female fertility in Drosophila melanogaster.

Authors: Shannon L Hardie; Jing X Zhang; Jay Hirsh
Journal: Dev Neurobiol Date: 2007-09-01 Impact factor: 3.964

10. Roles of octopaminergic and dopaminergic neurons in appetitive and aversive memory recall in an insect.

Authors: Makoto Mizunami; Sae Unoki; Yasuhiro Mori; Daisuke Hirashima; Ai Hatano; Yukihisa Matsumoto
Journal: BMC Biol Date: 2009-08-04 Impact factor: 7.431
