Temporal specificity in Pavlovian-to-instrumental transfer.

Matthew S Matell, Rebecca B Della Valle.

Abstract

Presentation of a previously trained Pavlovian conditioned stimulus while an organism is engaged in operant responding can moderate the rate of responding, a phenomenon known as Pavlovian-to-instrumental transfer. Although it is well known that Pavlovian contingencies will generate conditioned behavior that is temporally organized with respect to the arrival of the predicted outcome, little work has examined the temporal dynamics of responding during Pavlovian-instrumental transfer. We trained rats using a fixed time 60-sec, fixed time 120-sec, or random time 60-sec schedule in an appetitive Pavlovian task, and found that presentation of the conditioned stimulus potentiated operant responding in a manner that reflected these previously established temporal expectancies. Further, this temporal specificity conformed to the scalar property as seen with other forms of interval timing behavior. Surprisingly, this effect was only seen when the conditioned stimulus was a visual cue, but not when it was an auditory cue. These data suggest that the motivational processes triggered by Pavlovian cues are not static, but fluctuate in strength as a function of temporally specific expectations of reward.
© 2018 Matell and Della Valle; Published by Cold Spring Harbor Laboratory Press.

Year:  2017        PMID: 29246977      PMCID: PMC5733466          DOI: 10.1101/lm.046383.117

Source DB:  PubMed          Journal:  Learn Mem        ISSN: 1072-0502            Impact factor:   2.460


Investigating the influence that reward predicting cues have on cognitive and motivational processes is frequently carried out by assessing the magnitude of Pavlovian-instrumental transfer (PIT) (for a recent review, see Cartoni et al. 2016). In the appetitive version of this task, subjects are trained under a Pavlovian contingency that an initially neutral stimulus (e.g., an auditory or visual signal) predicts the subsequent delivery of an unconditioned stimulus (e.g., a food reward). After sufficient pairings, subjects will typically show a conditioned response (e.g., approach to a food magazine) in response to the conditioned stimulus (CS+). Independently, subjects are also given operant training in which reinforcement (e.g., the same food reward) is delivered following a behavioral response (e.g., a lever press), until responding is reliably generated. Subsequently, when tested under extinction conditions, the presentation of the CS+ potentiates the rate of operant responding when compared with both the baseline rate and a previously presented, but unreinforced, cue (CS−). The increase in responding is attributed to the activation of conditioned motivation and/or expectancy of reward from the CS+ (Konorski 1967; Rescorla and Solomon 1967; Bindra 1974), thereby invigorating habitual and/or goal-directed behavior (Balleine and Dickinson 1998). Continued investigation of PIT has revealed that at least two different processes contribute to these motivational effects (for a recent review, see Cartoni et al. 2016). If the presentation of a Pavlovian CS+ potentiates operant responding, irrespective of whether the reward(s) trained during the Pavlovian phase is the same as that trained during the operant phase (Holland 2004), the process is deemed to be “general” PIT, and has been linked to broad incentive-motivational processes (Bindra 1974). 
Enhanced responding in general PIT is therefore suggested to reflect the hedonic value of the reward, irrespective of its specific sensory qualities, and results from conditioned activation of a motivational/emotional state. In contrast, when subjects are trained that different operant responses result in different outcomes, the presentation of a Pavlovian cue that signals one of these outcomes will potentiate responding for that outcome, at levels significantly greater than what is evoked when responding for the other outcome (Kruse et al. 1983; Colwill and Rescorla 1990). This “sensory-specific” PIT is thought to be mediated by activation of a representation of the expected sensory composition or quality of the outcome, which is associated with only one of the actions. Surprisingly, under these latter conditions, the general “goodness” of the outcome tends to have a diminished capacity to facilitate responding. We have known for over 60 years that classical and operant conditioning results in behavioral dynamics that reflect the specific temporal interval between the predictive signal and the predicted outcome (Pavlov 1927; Ferster and Skinner 1957). This temporal knowledge is best demonstrated by incorporating occasional long duration probe trials into the conditioning procedure, thereby allowing measurement of both the onset and termination of responding. For example, Roberts (1981) trained rats on an operant temporal production task, commonly referred to as a peak-interval procedure. In this work, rats were exposed to a discrete trials fixed-interval schedule of reinforcement in which the first operant response following the passage of a criterion interval (i.e., 40 sec) was reinforced with food delivery, while responses prior to this duration had no programmed consequence. 
On other “probe” trials, the discriminative stimulus was presented, and remained on for several times the fixed interval, before terminating without reinforcement in a response independent manner. Plots of the average response rate on these probe trials demonstrated that responding increased and then decreased in a temporally specific, peak-shaped, manner, such that it was maximal at the trained interval. Similar findings have also been demonstrated for Pavlovian conditioning. For instance, Drew et al. (2005) demonstrated that peak-shaped responding which reflected the CS–US interval appeared as soon as conditioned responding emerged, and that additional training simply sharpened the temporal gradient. Subsequent work has demonstrated that these smooth peak-shaped mean functions are an artifact of averaging across trials, and that the response form on single trials is well characterized by a step function in which responding begins and ends abruptly, typically bracketing the fixed interval (Cheng and Westwood 1993; Church et al. 1994; Matell et al. 2006; Taylor et al. 2007; Balci et al. 2009). Surprisingly, despite our knowledge about the temporal dynamics of conditioned behavior, very few investigations have explored the impact of temporal expectancy on Pavlovian-instrumental transfer. An early demonstration of temporally moderated transfer comes from Rescorla (1967) who trained dogs in a Sidman avoidance task to move from one side of a shuttle box to the other in order to avoid shock. He separately trained the dogs with a Pavlovian contingency that a tone predicted shock after 30 sec. During test, he presented the tone and found that the dogs displayed a temporal sequence of diminished shuttle box responding early in the CS followed by increased responding as time in the CS approached the time at which the US had previously occurred. These data suggest that PIT is not static, and may reflect previously acquired temporal information. 
However, as the CS presentation at test was always the same duration as during training (in which it terminated in shock), it is not clear whether the subjects actually transferred a specific temporal expectation, or whether their behavior simply reflected a monotonic increase in fear motivation given that shock had not already happened. Indeed, Rescorla (1967) interpreted the data as reflecting the induction of inhibition (of delay) upon CS onset that decayed over time more rapidly than the CS-induced excitation decayed. By this interpretation, one would expect that extending the CS interval would reveal continued growth or maintenance of responding, rather than temporal specificity.
More recently, Delamater and Oakeshott (2007) trained rats on a 60-sec variable-interval schedule in which two different operant responses (chain pull and lever press) resulted in different rewards (pellet or sucrose solution). Subsequently, they trained rats that two different CSs (each presented for 60 sec) predicted these same rewards at stimulus offset (e.g., noise → pellets, light → sucrose). They then assessed responding of each operant in the presence and absence of these CSs. They found sensory-specific PIT, such that responding for sucrose was elevated when the CS for sucrose was presented, when compared with when the CS for pellets was presented, and vice versa. Like the data from Rescorla (1967), the magnitude of this transfer effect became progressively larger as the duration in the presence of the CS elapsed. Again, these data are consistent with the notion that the temporal expectancy signaled by the CS was transferred in concert with the sensory-specific representation. However, also like Rescorla's report, these data may alternatively imply that the impact of the CSs simply grew as a function of time, due to a temporally nonspecific source, such as frustration or overall hunger (i.e., the influence of the CS is assessed during extinction). Indeed, Crombag et al. (2008) trained mice on a random time 30-sec schedule in the presence of a 120-sec CS (such that four USs were typically delivered during the CS), and then trained them to lever press for the same outcome. While these mice showed enhanced operant responding when presented with the CS at test, this enhancement in responding grew in magnitude over the first 80 sec and then plateaued. As the mice were unlikely to be exposed to CS–US intervals this long in training, the continued increase in rate as a function of time is not obviously consistent with the transfer of a specific temporal expectation, and provides support for the notion that frustration or session-level hunger might continuously build up over time.
Delamater and Holland (2008) conducted a largely equivalent set of experiments to Delamater and Oakeshott (2007), but additionally varied the CS–US interval (i.e., 20, 60, or 180 sec) across groups. During the PIT test, they presented the CS for the same duration used in training, and found that responding was maximal at the end of the CS when trained with a 20- or 60-sec CS, although it was flat when trained with a 180-sec CS. As the rate at which responding increased during the CS varied as a function of duration (i.e., it was steeper for the 20-sec condition than the 60-sec condition), these data suggest that sensory-specific PIT can reflect the temporal interval associated with the CS. Indeed, in a further demonstration of this effect, Delamater et al. (2017) trained rats using a 120-sec CS, but delivered the US 20 sec after CS onset. When tested for PIT, responding was maximal in the first 20 sec, and declined over the subsequent 100 sec (see also Delamater et al. 2014). Together, these findings indicate that the temporal dynamics seen in PIT are not the result of simple monotonic increases due to frustration, fear, or general hunger, and instead suggest that PIT effects are modulated by the temporal expectations signaled by the CS.
Given these prior findings, an important next step is to identify the form and characteristics of the temporal modulation of PIT. One central feature of interval timing behavior is that it conforms to the scalar property, an expression of Weber's law, such that the variability in temporally specific responding is directly proportional to the interval being timed (Gibbon 1977). For example, in temporal production procedures like the peak-interval task described above, plots of the average rate of responding as a function of time are well described by a Gaussian-shaped curve centered at the fixed interval being timed and with a spread (e.g., the width at half-maximal responding) that is proportional to the interval (Matell and Meck 2000). Indeed, when the x-axis is rescaled so that each bin is a fixed proportion of the interval being timed (e.g., 1-sec bins for a 10-sec fixed interval, 2-sec bins for a 20-sec fixed interval), the curves for the different intervals superimpose, thereby indicating that temporal precision is relative to the timed interval. Therefore, we conducted the present experiments to ascertain whether the temporal relation associated with a Pavlovian contingency would be transferred to the operant response when tested in a Pavlovian-instrumental transfer procedure, and if so, whether the dynamics would reflect the scalar property of interval timing.
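The superposition described above can be illustrated with a minimal sketch. It assumes an idealized Gaussian response function whose spread is a constant fraction of the timed interval; the function names and the value of `cv` (0.3) are hypothetical choices for illustration and are not taken from the present experiments.

```python
import math

def gaussian_rate(t, interval, cv=0.3):
    """Idealized (peak-normalized) response rate at time t for a trained interval.

    The spread is cv * interval, which is the scalar-property assumption.
    cv = 0.3 is an arbitrary illustrative value, not an estimate from data.
    """
    sigma = cv * interval
    return math.exp(-0.5 * ((t - interval) / sigma) ** 2)

def rescaled_curve(interval, n_bins=20, max_multiple=2.0):
    """Sample the curve at fixed proportions of the interval (relative time)."""
    proportions = [max_multiple * i / n_bins for i in range(n_bins + 1)]
    return [gaussian_rate(p * interval, interval) for p in proportions]

# Curves for a 10-sec and a 20-sec interval, each sampled in relative time.
curve_10 = rescaled_curve(10.0)
curve_20 = rescaled_curve(20.0)

# Largest pointwise difference between the two rescaled curves.
max_gap = max(abs(a - b) for a, b in zip(curve_10, curve_20))
print(max_gap)
```

Under this assumption the two rescaled curves coincide up to floating-point error, whereas in absolute time the 20-sec curve is both later and broader than the 10-sec curve.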

Results

Experiment 1—Pavlovian-instrumental transfer with 60-sec fixed-time schedule

We trained rats under a Pavlovian contingency that a conditioned stimulus (CS+, light or tone) probabilistically predicted delivery of a food pellet US 60 sec following onset of the CS+. Occupancy of the food magazine was greater during the 60-sec period during the CS+ relative to a pre-CS baseline period of the same duration, as well as in comparison to the CS− (a signal of the other modality than the CS+, and presented for the same duration as the CS+ but not followed by the US). A repeated-measures ANOVA comparing mean magazine occupancy as a function of Period (pre-CS, post CS+, post CS−), and CS+ Modality (tone, light) showed a main effect of Period (F(2,36) = 20.21, P < 0.001, partial η2 = 0.53). No other results were significant (all Fs < 1). Paired t-tests confirmed that occupancy was greater during the CS+ than during the pre-CS baseline and during the CS− (both P < 0.001), whereas there was no difference in occupancy during the CS− when compared with the pre-CS baseline. As can be seen in Figure 1, magazine occupancy on probe trials following onset of the CS+ was generally maximal around the time that the US would be delivered. A repeated-measures ANOVA of food magazine occupancy during the CS+ as a function of time revealed a main effect of Time (F(11,198) = 8.47, P < 0.001, partial η2 = 0.32), but no effect of modality (F < 1), nor an interaction (F(11,198) = 1.87). An equivalent ANOVA on CS− responding yielded no significant effects (all Fs < 1).
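As a reading aid, the reported partial η2 values can be cross-checked from the corresponding F statistics and degrees of freedom using the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The sketch below is only such a check; the function name is hypothetical and this is not a description of the authors' analysis pipeline.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Main effect of Period on magazine occupancy: F(2,36) = 20.21
print(round(partial_eta_squared(20.21, 2, 36), 2))   # 0.53

# Main effect of Time on magazine occupancy: F(11,198) = 8.47
print(round(partial_eta_squared(8.47, 11, 198), 2))  # 0.32
```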
Figure 1.

Average occupancy in the food magazine as a function of time preceding and following the presentation of CS+ and CS− on extended duration, no-US probe trials, sorted by cue modality. Rats were trained with a 60-sec fixed time CS–US interval.

Once temporal control was established under the Pavlovian contingency, rats were trained on an operant variable interval 60-sec schedule until responding was reliably emitted at a low rate. To examine Pavlovian-instrumental transfer, we then presented the CS+ and CS− under extinction conditions while subjects were engaged in the operant behavior. As can be seen in Figure 2, the rate of operant responding during the first 60 sec of the CS+ appeared to be higher than both the pre-CS rate and the rate in the presence of the CS−.
Figure 2.

Average operant response rate during the nonreinforced transfer tests. Mean responding during a 60-sec period preceding or following onset of the CS+ and CS− is presented, sorted by cue modality.

These response rates were analyzed using a repeated-measures ANOVA in which Period (prerate, CS+, CS−) was a within-subjects factor, and CS+ Modality (light, tone) was a between-subjects factor. There was a main effect of Period (F(2,36) = 40.10, P < 0.001, partial η2 = 0.69), but no effect of modality (F < 1), nor an interaction (F(1,18) = 1.87). Paired t-tests confirmed that the rate during the CS+ was higher than the rate during both the baseline and the CS− (both P < 0.001). As can be seen in Figure 3, the CS+ induced increase in responding was not static, but varied as a function of time. However, the gradient of responding appeared to vary as a function of the CS+ modality.
Figure 3.

The temporal pattern of operant responding on nonreinforced transfer tests, prior to and in the presence of an extended duration CS+ and CS−, sorted by modality.

A repeated-measures ANOVA on normalized responding during the CS+ confirmed a main effect of Time (F(11,198) = 4.16, P < 0.001, partial η2 = 0.19), a main effect of modality (F(1,18) = 12.23, P < 0.005, partial η2 = 0.41), and a time × modality interaction (F(11,198) = 2.14, P < 0.05, partial η2 = 0.11). Separate repeated-measures ANOVAs with each modality revealed an effect of time in the Light CS+ group (F(11,99) = 4.12, P < 0.001, partial η2 = 0.31), but only a trend in the Tone CS+ group (F(11,99) = 1.78, P = 0.068). Indeed, trend analyses examining the shape of the effect over time revealed both a linear and a quadratic effect to the response function of the light group (Ps < 0.005), but only a linear effect in the tone group (P < 0.05). To further examine the temporal pattern of responding, we used a change point detection algorithm developed by Gallistel et al. (2004), to identify the times at which response rates changed for individual rats on individual trials. We elected to use this algorithm, rather than the traditional single-trial algorithm (Cheng and Westwood 1993; Church et al. 1994), as we did not want to make assumptions regarding the form of single-trial responding. There was a mean of 8.1 (SD = 3.2) trials (out of 16) in which there was at least one transition to a higher response rate during the trial, and a mean of 6.3 (SD = 2.8) trials in which there was at least one transition to a lower response rate. The remaining trials had either no responses, or the rate and pattern of responding did not sufficiently differ from the pretrial rate and pattern to identify rate transition times. Because the number of trials with evidence for rate changes is low, we encourage caution in interpretation.
On those trials in which a rate change was detected, there was a mean of 1.8 (SD = 0.5) transitions to a higher rate, and a mean of 1.5 (SD = 0.5) transitions to a lower rate. We used the first increase in rate and the first decrease in rate as putative “start” and “stop” times, as has been done previously (Taylor et al. 2007; Balci et al. 2009). Supplemental Figure S1 displays the distributions of these rate transition times across all rats, split by modality. As can be seen, the distributions of rate increases (“start times”) and decreases (“stop times”) roughly bracket the mean response functions shown in Figure 3, as expected if the potentiated responding was driven by an interval timer. The mean time of a transition to a faster rate was 44.7 sec (SD = 17.6), with a mean within-rat standard deviation (i.e., variability in start times) of 34.1 sec (SD = 10.9). The mean time of a transition to a lower rate was 67.7 sec (SD = 22.2), with a mean within-rat standard deviation (i.e., variability in stop times) of 39.4 sec (SD = 16.0).
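The step of selecting putative start and stop times from a trial's detected rate transitions can be sketched as follows. This is not the Gallistel et al. (2004) change point algorithm itself; it assumes that change point times and the response rate within each resulting segment have already been obtained, and the function name and example values are hypothetical.

```python
from typing import List, Optional, Tuple

def first_start_and_stop(
    change_times: List[float], segment_rates: List[float]
) -> Tuple[Optional[float], Optional[float]]:
    """Return the times of the first rate increase and the first rate decrease.

    change_times: times (sec) at which the response rate changed on one trial.
    segment_rates: rate in each segment, so it has len(change_times) + 1 entries
                   (before the first change point, between change points, after the last).
    Either value is None when no transition of that direction was detected.
    """
    if len(segment_rates) != len(change_times) + 1:
        raise ValueError("need one more segment rate than change points")
    start: Optional[float] = None
    stop: Optional[float] = None
    for i, t in enumerate(change_times):
        before, after = segment_rates[i], segment_rates[i + 1]
        if after > before and start is None:
            start = t  # first transition to a higher rate
        if after < before and stop is None:
            stop = t   # first transition to a lower rate
    return start, stop

# Hypothetical trial: low rate, then high from 38 sec, lower from 71 sec, up again at 95 sec.
print(first_start_and_stop([38.0, 71.0, 95.0], [0.2, 0.9, 0.3, 0.5]))  # (38.0, 71.0)
```

Note that the two times are selected independently, so on a given trial the first decrease can precede the first increase.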

Experiment 2—Pavlovian-instrumental transfer with 60-sec random-time schedule

The results of Experiment 1 suggest that the temporal relationship established between the CS and US during training directly moderates the dynamics of operant responding upon CS presentation. However, before we can draw this conclusion, it is necessary to rule out the possibility that these temporal patterns of activity are simply the typical gradients of increased responding resulting from presentation of a visual or auditory conditioned stimulus of equivalent associative strength during extinction conditions. In Experiment 2, we trained another group of rats using the same cues and equivalent average reward rate, but in which the CS+ provided poor temporal predictability of US delivery by training the Pavlovian contingency with a random time 60-sec schedule. As expected, occupancy of the food magazine on probe trials was greater during the 60-sec period during the CS+ relative to a pre-CS baseline period of the same duration, as well as in comparison to the CS−. A repeated-measures ANOVA comparing mean magazine occupancy as a function of Period (pre-CS, post CS+, post CS−), and CS+ modality (light, tone) showed a main effect of Period (F(2,16) = 7.18, P < 0.01, partial η2 = 0.47). No other results were significant (Fs < 1). Paired t-tests confirmed that occupancy was greater during the CS+ than during the pre-CS baseline and during the CS− (both P < 0.05), whereas there was no difference in occupancy during the CS− when compared with the pre-CS baseline. As can be seen in Figure 4, magazine occupancy on probe trials increased following onset of the CS+. However, unlike rats trained with the fixed time schedule in Experiment 1, training with a random time 60-sec schedule resulted in magazine checking behavior that did not appear to show systematic variation as a function of time.
Indeed, a repeated-measures ANOVA of food magazine occupancy during the CS+ as a function of time revealed only a trend for an effect of Time (F(11,88) = 1.75, P = 0.075), no effect of modality (F < 1), and no interaction (F(11,88) = 1.22). For completeness, an equivalent ANOVA on food magazine occupancy during the CS− revealed no effect of Time (F < 1), no effect of Modality (F(1,8) = 3.02), and no interaction (F < 1).
Figure 4.

Average occupancy in the food magazine as a function of time preceding and following the presentation of CS+ and CS− on extended duration, no-US probe trials, sorted by cue modality. Rats were trained with a 60-sec random time CS–US interval.

Following operant training, Pavlovian-instrumental transfer was investigated. Mean response rates during the first 60 sec of CS+ and CS− presentations (i.e., up to the time that the US would have occurred on average), as well as the 60-sec period prior to CS onset were calculated. As can be seen in Figure 5, the rate of responding during the test session appeared to be higher in the presence of the CS+ relative to the pre-CS rate and the CS− rate. There also appeared to be greater responding to the tone than the light, across all periods.
Figure 5.

Average operant response rate during the nonreinforced transfer tests. Mean responding during a 60-sec period preceding or following onset of the CS+ and CS− is presented, sorted by cue modality.

A repeated-measures ANOVA supported these impressions. There was a main effect of Period (pre-CS, CS+, CS−) on response rates (F(2,16) = 7.38, P < 0.005, partial η2 = 0.48). There was also an effect of CS+ modality (F(1,8) = 5.90, P < 0.05, partial η2 = 0.42), as responding was greater across all periods in the rats trained with a tone as the CS+. There was no Modality × Period interaction (F(2,16) = 1.52). Paired t-tests comparing response rates across periods indicated that the CS+ rate was greater than the pre-CS rate (P < 0.05), as well as the CS− rate (P < 0.005), but there was no significant difference between the CS− rate and the base rate. As can be seen in Figure 6, the CS+ induced increase in responding appeared to be relatively static, rather than changing over time in a systematic manner. Indeed, a repeated-measures ANOVA on normalized responding during the CS+ revealed only a trend for an effect of time (F(11,88) = 1.87, P = 0.055, partial η2 = 0.19), no effect of modality (F(1,8) = 1.03), and no interaction (F < 1).
Figure 6.

The temporal distribution of operant responding on nonreinforced transfer tests, prior to and in the presence of an extended duration CS+ and CS−, sorted by modality.

As in Experiment 1, we identified the time of rate changes on individual trials. There was a mean of 4.6 (SD = 2.2) trials (out of 16) in which there was at least one transition to a higher response rate during the trial, and a mean of 2.5 (SD = 1.8) trials in which there was at least one transition to a lower response rate. On those trials in which a rate change was detected, there was a mean of 1.4 (SD = 0.5) transitions to a higher rate, and a mean of 1.4 (SD = 0.5) transitions to a lower rate. Supplemental Figure S2 displays the distributions of these rate transition times across all rats, split by modality. As can be seen, the distributions are much flatter in this experiment than they were in Experiment 1. The mean time of a transition to a faster rate was 84.1 sec (SD = 21.5), with a mean within-rat standard deviation of 52.5 sec (SD = 15.8). The mean time of a transition to a lower rate was 69.4 sec (SD = 28.4), with a mean within-rat standard deviation of 38.9 sec (SD = 23.9). As the purpose of this experiment was to assess whether the temporal pattern of responding to the CS+ was a result of the temporal predictability of the US in the Pavlovian phase, we also compared the normalized CS+ induced operant rates in the current experiment (random time 60-sec schedule) with those from the rats from Experiment 1 (fixed time 60-sec schedule). A repeated-measures ANOVA with Time as a within subject factor, and Modality and Schedule as between subject factors revealed a main effect of Time (F(11,286) = 2.25, P < 0.05, partial η2 = 0.08) and a main effect of Schedule (F(1,26) = 17.86, P < 0.001, partial η2 = 0.41).
There was no main effect of Modality (F < 1), but there was an interaction of Modality with Schedule (F(1,26) = 7.87, P < 0.01, partial η2 = 0.23), due to the fact that the light CS+ generated a larger increase than the tone CS+ in Experiment 1 (P < 0.005), whereas there was no significant modality difference in PIT in Experiment 2. Most important, the ANOVA indicated a Time × Schedule interaction (F(11,286) = 2.53, P < 0.005, partial η2 = 0.09), due to the fact that the CS+ induced responding was temporally varying when trained with a fixed time schedule (peak shaped with the light CS+ and monotonically decreasing with the tone CS+), but flat when trained with a random time schedule. Finally, an ANOVA comparing the mean time and breadth of single-trial rate increases and decreases across experiments revealed a main effect of Schedule for the time of rate increases (F(1,28) = 26.84, P < 0.001), but no effect of Modality nor an interaction (Fs < 1). The mean time at which a rate increase occurred was earlier for the rats trained with an FT60-sec schedule (44.7 sec) than those trained with a RT60-sec schedule (84.1 sec). A comparison of the breadth of the distributions of rate increases across trials revealed a main effect of Schedule (F(1,28) = 14.08, P < 0.001), but no effect of Modality (F < 1), and no interaction (F(1,28) = 2.23). The width of the distribution of rate increases was narrower in rats trained with the FT60-sec schedule (34.1 sec) than those trained with the RT60-sec schedule (52.5 sec). For the time of rate decreases, there was no effect of Schedule (F < 1), no effect of Modality (F(1,27) = 1.15), and no interaction (F(1,26) = 2.63). A comparison of the breadth of the distribution of rate decreases across trials revealed no effect of Schedule (F < 1), but an effect of Modality (F(1,26) = 4.73, P < 0.05), and an interaction (F(1,27) = 9.00, P < 0.01).
Probing each modality separately revealed a trend for an effect in Tone CS+ rats (F(1,13) = 3.90, P = 0.072), as the within-rat distribution width was broader in FT60-sec rats than it was in RT60-sec rats. In contrast, with a light CS+, the within-rat distribution width of the first rate decrease was narrower in the FT60-sec rats than in RT60-sec rats (F(1,12) = 5.15, P < 0.05). Together, these experiments suggest that the pattern of visual CS+ induced responding seen in Experiment 1 was not due to presentation of an appetitive conditioned stimulus under extinction conditions. Rather, these data indicate that the temporally specific potentiation of operant responding is reflective of the temporal relationship that was established between the CS and US during Pavlovian training.

Experiment 3—Pavlovian-instrumental transfer with 120-sec fixed-time schedule

As described in the Introduction, perception and action with respect to time in the seconds-to-minutes range conforms to Weber's law, and is referred to as the scalar property (Gibbon 1977). Thus, if the temporal information acquired during Pavlovian training modulates the strength of operant responding, then the time and breadth of increased responding should vary in proportion to the CS–US interval. To evaluate this hypothesis, we conducted a third experiment in which we doubled the temporal interval between CS+ onset and US delivery (i.e., a 120-sec fixed time interval between CS and US). The reward rate was maintained across experiments by also doubling the US amount. In addition, we assessed the impact of Pavlovian extinction on temporal transfer by giving only half the rats extinction training as used in the previous experiments. Extinction training has been used by some laboratories to minimize CS+ induced magazine approach that can interfere with operant responding (Holmes et al. 2010), although others have reported no effects of Pavlovian extinction on PIT (Delamater 1996). Irrespective of these conflicting reports on the magnitude of PIT, extinction could also alter the temporal expectancy of the US (Drew et al. 2017), and therefore weaken temporally specific transfer. At the end of Pavlovian training, occupancy of the food magazine was greater during the 120-sec period following the CS+ relative to a pre-CS baseline period of the same duration, as well as in comparison to the CS−. A repeated-measures ANOVA comparing mean magazine occupancy as a function of Period (pre-CS, post CS+, post CS−), and CS+ Modality (light, tone) showed a main effect of Period (F(2,36) = 22.66, P < 0.001, partial η2 = 0.56). No other results were significant (Fs < 1).
Paired t-tests confirmed that occupancy was greater during the CS+ than during the pre-CS baseline and during the CS− (both P < 0.001), whereas there was no difference in occupancy during the CS− when compared with the pre-CS baseline. As can be seen in Figure 7, magazine occupancy on probe trials following onset of the CS+ was maximal around the time that the US would be delivered when the CS+ was a light. In contrast, magazine occupancy was maximal in the first bin following tone CS+ onset, and declined back toward baseline around the time at which the US would have been delivered.
Figure 7.

Average occupancy in the food magazine as a function of time preceding and following the presentation of CS+ and CS− on extended duration, no-US probe trials, sorted by cue modality. Rats were trained with a 120-sec fixed time CS–US interval.

A repeated-measures ANOVA on food magazine occupancy as a function of time during the CS+ revealed a main effect of Time (F(11,198) = 3.08, P < 0.05, partial η2 = 0.15), but no effect of modality (F < 1). The interaction between time and modality failed to reach significance (F(11,198) = 2.21, P = 0.096). An equivalent ANOVA on CS− responding indicated no effect of Time (F < 1), no effect of Modality (F(1,18) = 2.19), and no interaction (F < 1). To assess whether or not Pavlovian-instrumental transfer occurred, mean response rates during the first 120 sec of CS+ and CS− trials (i.e., up to the time that the US would have occurred), as well as the 120-sec period prior to CS onset were calculated. As can be seen in Figure 8, the rate of responding during the test session appeared to be higher in the presence of the CS+ relative to both the pre-CS rate and the CS− rate.
Figure 8.

Average operant response rate during the nonreinforced transfer tests. Mean responding during a 120-sec period preceding or following onset of the CS+ and CS− is presented, sorted by cue modality and whether the rats received extinction sessions with the CSs.

In addition, extinction training appeared to lower CS+ induced rates. A repeated-measures ANOVA supported these impressions. There was a main effect of Period (pre-CS, CS+, CS−) on response rates (F(2,32) = 21.01, P < 0.001, partial η2 = 0.57). In addition, there was a Period by Extinction interaction (F(2,32) = 5.28, P < 0.05, partial η2 = 0.25). No other effects were significant: Modality (F(1,16) = 1.08), Extinction (F(1,16) = 2.87), Modality × Extinction (F < 1); Period × Modality (F < 1), Period × Modality × Extinction (F < 1). Paired t-tests comparing response rate across Periods indicated that the CS+ rate was greater than the pre-CS rate (P < 0.001), as well as the CS− rate (P < 0.001), whereas there was no significant difference between the CS− rate and the base rate. As we conducted extinction training in half the rats to decrease the potential for magazine approach responding that might contaminate temporal transfer, we sought to evaluate the effectiveness of this manipulation. Surprisingly, magazine occupancy during test did not vary significantly as a function of CS presentation (F(2,32) = 1.44), Extinction (F(1,16) = 1.98), nor as a function of Modality (F < 1), and no interactions were significant (all Fs < 1.36). Only one session of transfer data was collected prior to an experimenter error that excessively shortened the inter-trial interval. Therefore, responses from this experiment were binned in 60-sec bins to minimize the impact of noise in presentation and analysis. Figure 9 displays the normalized response rate as a function of time prior to, and during, presentation of the CS+.
Figure 9.

The temporal distribution of operant responding on nonreinforced transfer tests, prior to and in the presence of an extended duration CS+ and CS−, sorted by modality and whether the rats received extinction sessions with the CSs.

There appeared to be an increase followed by a decrease in responding as a function of time in the CS+. This appears clearer in the groups that were not given extinction training. A repeated-measures ANOVA on normalized responding during the CS+ revealed an effect of Time (F(5,80) = 3.62, P < 0.05, partial η2 = 0.19), and an effect of Extinction (F(1,16) = 25.0, P < 0.001, partial η2 = 0.61). No other effects were significant: Modality (F(1,16) < 1), Modality × Extinction (F(1,16) = 2.46), Time × Modality (F < 1), Time × Extinction (F < 1), Time × Modality × Extinction (F < 1). We identified the time of rate changes on individual trials. Due to the single session, there were only four trials of possible data, with fewer trials in which subjects responded. In the extinguished group, there was a mean of 1.2 (SD = 1.0) trials with at least one transition to a higher rate, and a mean of 1.0 (SD = 0.8) trials with at least one transition to a lower rate. In the nonextinguished group, there was a mean of 3.0 (SD = 0.9) trials with at least one transition to a higher rate, and a mean of 2.1 (SD = 0.9) trials with at least one transition to a lower rate. As we needed at least two trials with transitions to compute spreads, we restricted our analysis to rats from the nonextinguished group that met this criterion (n = 9 for up transitions and 7 for down transitions). The distribution of rate transitions is shown in Supplemental Figure S3. The mean time of a transition to a faster rate was 62.7 sec (SD = 31.4), with a mean within-rat standard deviation of 49.6 sec (SD = 26.4). The mean time of a transition to a lower rate was 119.3 sec (SD = 67.5) with a mean within-rat standard deviation of 62.2 sec (SD = 32.0). 
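As a rough illustration of what locating a rate transition on a single trial involves, the sketch below searches for the single most likely change in response rate within one trial, assuming that responding follows a Poisson process whose rate switches once. It is a simplified likelihood-based stand-in and not the change-point algorithm used for the analyses reported here; the function name `best_rate_change` and the response times in the usage example are hypothetical.

```python
import math


def best_rate_change(response_times, trial_end):
    """Find the most likely single rate change within one trial.

    Assumption (for illustration only): responses are a Poisson process
    with one rate before the change and another rate after it.
    response_times: sorted response times in sec from CS onset.
    trial_end: trial duration in sec.
    Returns (change_time, log_likelihood_gain, rate_before, rate_after).
    """
    n = len(response_times)
    if n < 2:
        return None, 0.0, None, None

    def loglik(count, duration):
        # Maximized Poisson log-likelihood for `count` events in `duration`.
        if count == 0 or duration <= 0:
            return 0.0
        rate = count / duration
        return count * math.log(rate) - rate * duration

    baseline = loglik(n, trial_end)  # single constant rate, no change
    best = (None, 0.0, None, None)
    # Candidate change points are placed at each response except the last.
    for k in range(1, n):
        t = response_times[k - 1]
        if t <= 0 or t >= trial_end:
            continue
        gain = loglik(k, t) + loglik(n - k, trial_end - t) - baseline
        if gain > best[1]:
            best = (t, gain, k / t, (n - k) / (trial_end - t))
    return best


# Hypothetical trial: sparse early responding, then a burst after about 60 sec.
trial = [10.0, 40.0, 61.0, 63.0, 65.0, 67.0, 69.0, 71.0, 73.0, 75.0]
change_time, gain, rate_before, rate_after = best_rate_change(trial, 80.0)
direction = "up" if rate_after > rate_before else "down"
```

Applying such a function to every trial of a rat and taking the mean and standard deviation of the resulting change times would give per-rat transition times and within-rat spreads of the kind summarized above. A real analysis would also need a decision criterion on the likelihood gain before accepting a change.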
In order to compare the time and breadth of CS+ potentiated responding from Experiments 1 and 3, we plot in Figure 10 the data from the two experiments in absolute time bins (top panels) and in bins proportional to the fixed interval (lower panels).
Figure 10.

Peak functions from Experiments 1 and 3 (nonextinguished group), plotted in both absolute (top panels) 15 sec wide time bins and with bins whose widths were proportional (bottom panels) to the fixed interval from that experiment (i.e., 15 and 30 sec). The left panels are from rats in which the light was the CS+ and the right panels are from rats in which the tone was the CS+. Data from Experiment 3 have been smoothed with a running mean for presentation to minimize noise due to the single session data set.

As can be seen, when plotted in absolute bins, responding was later and broader in the 120-sec groups compared with the 60-sec groups, again more clearly for the visual cue than the auditory cue. In contrast, when plotted in bins that were proportional to the fixed interval (i.e., 15-sec bins for the 60-sec group and 30-sec bins for the 120-sec group), the data showed approximate superimposition, thereby suggesting that PIT is scalar. To confirm these observations, we used a repeated-measures ANOVA to compare the normalized patterns of PIT in Experiment 1 to the nonextinction group of Experiment 3, using both absolute 15-sec bins over the 180 sec of stimulus time the two experiments had in common and using an equivalent number of proportional bins over the entire response interval (bin widths of 15 and 30 sec, for the 60 and 120-sec FT, respectively). With absolute bins of 15 sec, there was a main effect of Schedule (F(1,26) = 11.75, P < 0.005), an interaction between Time and Modality (F(11,286) = 2.87, P < 0.001), and a three-way interaction between Time, Modality and Schedule (F(11,286) = 1.92, P < 0.05). We therefore compared each modality separately. 
For the rats trained with a light CS+, there was a main effect of Time (F(11,143) = 2.48, P < 0.01), a main effect of Schedule (F(1,13) = 9.45, P < 0.01), and critically a Time × Schedule interaction (F(11,143) = 2.16, P < 0.05), thereby demonstrating that the pattern of enhanced responding differed in time as a function of the schedule (FT60 sec versus FT120 sec). In contrast, for rats trained with the tone CS+, there was a main effect of Time (F(11,143) = 1.89, P < 0.05), but no effect of Schedule (F(1,13) = 2.49), nor an interaction (F(11,143) = 1.18). Conversely, when responding was binned proportionally, there was a main effect of Time (F(11,286) = 4.22, P < 0.001), and a Time × Modality interaction (F(11,286) = 2.48, P < 0.01), with no other effects reaching significance. We also compared the means and spreads of the distributions of times at which the response rates transitioned to a higher rate from the single-trial change point analyses. An ANOVA with factors of Schedule and Modality revealed a main effect of Schedule (F(1,28) = 4.54, P < 0.05), a main effect of Modality (F(1,28) = 11.73, P < 0.005), as well as their interaction (F(1,28) = 14.58, P < 0.001). Probing the modalities separately revealed no difference in the time of a rate increase in the rats trained with a tone CS+, but a significantly (P < 0.001) later time of a rate increase in light CS+ rats trained with a 120-sec FT schedule (85.6 sec) when compared with light CS+ rats trained with a 60-sec FT schedule (43.3 sec). An ANOVA on the within-rat spread of rate increases revealed an effect of Schedule (F(1,28) = 4.67, P < 0.05), as the within-rat spread of rate increases with the FT120-sec schedule (49.6 sec) was larger than that with the FT60-sec schedule (34.1 sec). 
An ANOVA conducted on the time of rate decreases revealed a significantly later time of a rate decrease in rats trained with the FT120-sec schedule (119.3 sec) when compared with those trained with the FT60-sec schedule (67.7 sec) (F(1,26) = 9.79, P < 0.005), but no effect of Modality (F < 1), and no interaction (F(1,26) = 1.18). Likewise, a comparison of the within-rat spread of rate decreases revealed an effect of Schedule (F(1,26) = 7.88, P < 0.01), as the breadth was wider in the FT120-sec rats (62.2 sec) than the FT60-sec rats (39.4 sec). There was no effect of Modality (F(1,26) = 1.92), and a strong trend for an interaction (F(1,26) = 4.16, P = 0.053).
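The logic of comparing absolute and proportional bins can be illustrated with a minimal sketch. It assumes idealized, perfectly scalar responding in which every response time under the 120-sec schedule is exactly double the corresponding time under the 60-sec schedule. The helper names and the response times are hypothetical and are not data from these experiments.

```python
def bin_counts(response_times, bin_width, n_bins):
    """Count responses in consecutive equal-width bins starting at CS+ onset."""
    counts = [0] * n_bins
    for t in response_times:
        index = int(t // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


def normalize(counts):
    """Express each bin as a proportion of all counted responses."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Hypothetical response times (sec from CS+ onset), made up for illustration.
ft60_times = [35.0, 50.0, 58.0, 62.0, 70.0, 95.0]
ft120_times = [2 * t for t in ft60_times]  # idealized scalar responding

# Absolute binning: 15-sec bins over the shared 180 sec for both schedules.
abs_60 = normalize(bin_counts(ft60_times, 15, 12))
abs_120 = normalize(bin_counts(ft120_times, 15, 12))

# Proportional binning: bin width is one quarter of the trained interval.
prop_60 = normalize(bin_counts(ft60_times, 60 / 4, 12))
prop_120 = normalize(bin_counts(ft120_times, 120 / 4, 12))

superimposes = prop_60 == prop_120       # identical shape in relative time
differs_in_absolute = abs_60 != abs_120  # later and broader in absolute time
```

Under these idealized assumptions the two distributions coincide when binned proportionally and differ when binned absolutely, which is the pattern that a Schedule effect in absolute bins and its absence in proportional bins would indicate. Real data are noisy, so superimposition is assessed statistically rather than by exact equality.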

Discussion

The presentation of a conditioned stimulus (CS+) that predicts an unconditioned stimulus (US), can amplify independently trained operant responding, a phenomenon known as Pavlovian-to-instrumental transfer (PIT). Despite the considerable work investigating the behavioral and neurobiological mechanisms of PIT (Cartoni et al. 2016 lists over 100 published reports since 2000), examination of the temporal features of PIT has been relatively minimal (Rescorla 1967; Delamater and Holland 2008; Delamater et al. 2014, 2017). Further, while these reports have demonstrated differential temporal modulation of responding during the CS+, suggesting that potentiated responding reflects the interval learned in the Pavlovian phase, there has been no work showing that the form of mean responding on extended duration probe trials is peak shaped, as seen with other investigations of interval timing using temporal production procedures (Roberts 1981; Church et al. 1994; Rakitin et al. 1998; Matell and Meck 2000; Balci et al. 2009; Drew et al. 2017). Likewise, evaluating whether the temporal dynamics of PIT are scalar has not been previously done. To address these limitations, in Experiments 1 and 3, we trained rats with a Pavlovian contingency that a conditioned stimulus (CS+, light or tone) probabilistically predicted delivery of a food pellet US at a specific duration following onset of the CS+. On extended duration probe trials conducted during Pavlovian training, occupancy in the food magazine was increased by the CS+ relative to the CS− and when compared with baseline occupancy. Further, the CS+ induced occupancy varied as a function of time, with maximal occupancy occurring around the time at which the US was typically delivered, as seen previously (Kirkpatrick and Church 2000a,b; Drew et al. 2005; Balsam et al. 2006). 
After temporally controlled Pavlovian conditioned approach was obtained, rats were trained in an operant task that nosepoking would be reinforced on a lean variable interval schedule. During test sessions conducted in extinction, presentation of the CS+ increased the rate of operant responding, whereas presentation of the CS− did not, thereby demonstrating PIT. Importantly, the amplification of responding was not static, but varied as a function of time. For the groups trained with a visual CS+, maximal amplification occurred around the time that the US was expected. In contrast, in the groups trained with an auditory CS+, the pattern of increased responding did not peak at the trained CS–US interval, but typically was maximal at CS onset, although the rate of decline in responding appeared to reflect the CS–US interval. Indeed, a comparison of the response patterns in Experiments 1 and 3, in which subjects were trained with different CS–US intervals (60 and 120 sec, respectively), revealed differences in the absolute response pattern of responding over time as a function of schedule, but revealed no difference in the temporal pattern of responding when normalized by the CS–US interval (i.e., the response functions superimposed), suggesting that PIT, like other forms of interval timing behavior (Buhusi and Meck 2005), is scalar. As mentioned above, the mean response pattern on probe trials of temporal production procedures is well characterized by a Gaussian-shaped peak function centered at the fixed interval used during training. However, this smooth peak function is an artifact of averaging across trials, as single-trial analyses demonstrate a step-like pattern of responding that varies in location and duration across trials (Cheng and Westwood 1993; Church et al. 1994; Matell et al. 2006; Taylor et al. 2007; Balci et al. 2009). 
Specifically, subjects abruptly switch from a low baseline rate of responding to a high rate of responding at roughly 50% of the fixed interval, and then return to the low baseline rate of responding after ∼150% of the fixed interval. Due to variation in both the start and stop times of responding, this abrupt step pattern is smeared, resulting in the smooth peak-shaped mean function. We applied a rate-change detection algorithm developed by Gallistel et al. (2004) which detects changes in the inter-response interval on single trials, under the assumption that responding reflected a random rate (Poisson) process. While limited by the low numbers of trials and low rate of responding, this analysis provided evidence that the smooth mean functions presented here are likewise the result of abrupt changes in responding on individual trials. Thus, these data further support the notion that PIT is moderated by the output of an interval timing process. Many of the past demonstrations of temporally modulated behavior in PIT tests have been increasing monotonic response trends that could be attributed to temporally nonspecific growth of states such as the development of frustration or hunger due to testing in extinction rather than a motivational state that fluctuates in direct relation to the trained interval (although see Delamater et al. 2014, 2017 for evidence of temporal specificity in sensory-specific PIT). Similar nontiming interpretations could potentially be used to explain the data from Experiments 1 and 3. Therefore, in Experiment 2, we explicitly assessed whether the temporal pattern of responding during transfer testing was a reflection of the temporal predictability established during Pavlovian training. We trained rats on a random time 60-sec schedule, which led to the CS+ being a poor predictor of the time of US delivery. 
We found that although presentation of the CS+ resulted in an increase in responding, this increase in response rate did not systematically fluctuate as a function of time. As such, these data suggest that the systematic patterns seen here are not a direct consequence of the presentation of appetitively conditioned stimuli under extinction conditions. Taken together, our findings showing scalar, peak-shaped, temporally controlled mean functions resulting from abrupt single trial patterns of responding, are strong evidence that PIT reflects the temporal information learned during the Pavlovian phase. As such, the current results support the notion that the motivational surge and/or outcome expectancy hypothesized to be induced by the presentation of a CS+ is not static, but fluctuates in strength in a manner reflecting previously learned temporal relationships.
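The claim that a smooth mean peak function can arise from averaging abrupt single-trial steps can be illustrated with a small simulation. The sketch assumes that, on each trial, responding switches on at about 50% of the trained interval and off at about 150%, with normally distributed, independent start and stop times whose standard deviation is proportional to their mean. The coefficient of variation of 0.3, the trial count, and the function name are illustrative assumptions and are not estimates from the present data.

```python
import random


def mean_step_function(interval, n_trials=2000, cv=0.3, bin_width=5.0, seed=0):
    """Average many single-trial on/off step functions into a mean profile.

    Each simulated trial has a high-rate state (coded 1) between a start and
    a stop time and a low-rate state (coded 0) otherwise.
    Returns the proportion of trials in the high state for each time bin.
    """
    rng = random.Random(seed)
    n_bins = int(3 * interval / bin_width)
    totals = [0.0] * n_bins
    for _ in range(n_trials):
        # Scalar variability: SD grows in proportion to the mean time.
        start = rng.gauss(0.5 * interval, cv * 0.5 * interval)
        stop = rng.gauss(1.5 * interval, cv * 1.5 * interval)
        for b in range(n_bins):
            center = (b + 0.5) * bin_width
            if start <= center < stop:
                totals[b] += 1.0
    return [total / n_trials for total in totals]


profile = mean_step_function(60.0)
peak_bin = max(range(len(profile)), key=profile.__getitem__)
peak_time = (peak_bin + 0.5) * 5.0  # bin center in sec
```

Although every simulated trial is an all-or-none step, the averaged profile rises and falls gradually, with its maximum lying between the mean start and stop times. The exact location and symmetry of the peak depend on the assumed variability of the start and stop times.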

Type of PIT

There are different forms of PIT, and the particular type seen has been suggested to depend on the number of different operant responses and rewards used in the operant phase (Holland 2004; Cartoni et al. 2016). Specifically, when a single reinforcer is used in the operant phase, the potentiation that is obtained is referred to as “general” PIT as increased activity is seen irrespective of whether the reinforcer used in the operant phase is the same as, or different from, the US used in the Pavlovian phase. For example, Holland trained rats using two different CSs associated with two different outcomes in the Pavlovian phase, and then trained them with a single outcome in the operant phase. At test, both CSs (i.e., the one whose outcome matched the operant outcome, and the one whose outcome did not match the operant outcome) generated an equivalent degree of response potentiation, thereby indicating that the PIT was general, rather than outcome specific. General PIT has classically been interpreted as resulting from activation of an emotional or motivational process that energizes responding (Konorski 1967; Rescorla and Solomon 1967; Bindra 1974). In contrast, in studies in which multiple reinforcers are used in the operant phase, a “sensory-specific” PIT has been found (Trapold and Overmier 1972), in which the CS+ that predicts the same outcome as that associated with the current operant produces a much greater degree of enhancement than if the two outcomes do not match (Kruse et al. 1983; Rescorla 1994; Delamater and Oakeshott 2007). This sensory-specific PIT has been interpreted as resulting from CS+ induced activation of the expected sensory qualities of the outcome, which then facilitate the action associated with this outcome (Balleine and Dickinson 1998). 
The current procedure used only a single reinforcer for both the Pavlovian and operant phases, and therefore we cannot conclusively identify whether the CS+ induced potentiation of responding is due to a general arousing process, a sensory-specific expectancy process, or both. However, in addition to these behavioral dissociations, general and sensory-specific PIT have also been shown to be mediated by different neural mechanisms. The central nucleus of the amygdala and the core of the nucleus accumbens have been shown to be necessary for general, but not sensory-specific PIT, whereas the basolateral amygdala and the shell of the nucleus accumbens are necessary for sensory-specific, but not general PIT (Hatfield et al. 1996; Hall et al. 2001; de Borchgrave et al. 2002; Blundell et al. 2003; Holland and Gallagher 2003; Corbit and Balleine 2005). Of direct relevance to the current experiments, in the experiments performed by Hall et al. (2001) and Holland and Gallagher (2003), only a single outcome and single lever were used throughout training, and lesions of the central nucleus of the amygdala and core of the nucleus accumbens eliminated PIT, whereas lesions to the basolateral amygdala and shell of the nucleus accumbens did not. Although these reports suffer from the same issue as the present study regarding the inability to conclude whether the PIT that was disrupted was general, sensory-specific, or both, a subsequent report by Corbit and Balleine (2005) resolved this ambiguity. In their work, rats were trained using three outcomes in the Pavlovian phase, and two outcomes in the operant phase, thereby allowing behavioral dissociations of general and sensory-specific PIT in the same rats. Crucially, lesions of the central nucleus of the amygdala eliminated general PIT, while sparing sensory-specific PIT, whereas lesions of the basolateral amygdala eliminated sensory-specific PIT, while sparing general PIT. 
A follow-up study used the same procedure and demonstrated that the core of the nucleus accumbens was necessary for general PIT, but not sensory-specific PIT, while the shell of the accumbens was necessary for sensory-specific, but not general PIT (Corbit and Balleine 2011). Applying these findings to the reports by Hall et al. (2001) and Holland and Gallagher (2003) strongly implies that training with a single reinforcer in both Pavlovian and operant phases produces only general, and not sensory-specific PIT. As such, we interpret the present findings as reflecting general, rather than sensory-specific PIT. Nevertheless, given the lack of a behavioral dissociation in the present work, some caution should be taken regarding this conclusion. There are two obvious routes by which general PIT could generate the temporal dynamics found here. One possibility, the multicomponent model of Pavlovian learning suggested by Delamater et al. (2014), is that activation of the CS induces multiple independent processes, including a general motivational process, a hedonic response, a sensory specific representation, a behavioral response process and a representation of temporal information. Applied to the temporal domain, this component model is most consistent with a dedicated structure framework for timing and time perception. For example, structures such as the striatum and cerebellum have been suggested to play a prominent role in interval timing behavior (Ivry 1993; Matell and Meck 2004; Wiener et al. 2010). 
As such, one would suppose that in the current experiments, the activation of the CS+ induces a general motivational process through the central nucleus of the amygdala and core of the nucleus accumbens, which enhances operant responding in a steady state manner, and separately activates a temporal information processing network (e.g., in the striatum), which then moderates in time the production of the operant response either directly, or through modulation of the activity of these general PIT structures. However, in contrast to a centralized structure account of timing, a distributed temporal processing approach has also been proposed (Ivry and Spencer 2004; Karmarkar and Buonomano 2007). By this account, the time between and across events is such a critical facet of the environment that temporal information processing capabilities are embedded within all neural circuits. Indeed, recent models of interval timing can be implemented with a small number of neurons through incorporation of negative feedback (Simen et al. 2011). Applied to the current experiment, this notion would suggest that the CS+ induced activation is itself dynamic, and that the motivational enhancement underlying the response facilitation ebbs and flows in direct proportion to the dynamic CS+ induced activation (note that inclusion of a threshold for response potentiation would be required to generate the abrupt onset and offset of activity seen here). In other words, temporal information would be transferred through the same neural pathways as the general motivational effect is instantiated. Additional work will be required to differentiate between these possibilities. The idea that temporal relationships are intimately linked to Pavlovian associations (irrespective of their neural instantiation) has been argued previously. 
For example, Miller and colleagues have proposed that temporal relationships are part of the content of learning (along with the strength of association) and are always encoded during conditioning (Miller and Barnet 1993). In support of this proposition, Arcediano et al. (2003) have demonstrated knowledge of the temporal relationship between events by training rats with two different conditioning procedures that individually elicit minimal responding, but provide predictive information that does elicit responding if temporally integrated. Specifically, thirsty rats were trained with a sensory preconditioning procedure that a 3-sec click train preceded a 3-sec tone with a 5-sec gap in between the offset of the click train and the onset of the tone. Subsequently, the rats were trained with a backward conditioning procedure that a 1-sec foot shock preceded the onset of the tone by 4 sec. When tested for the response to the click train, the rats suppressed their licking for water, as though they were anticipating a shock, despite never having received a shock following the click train. Such an outcome can be understood if the rats integrated the temporal relations between the two phases of training, forming a temporal map of events. As the offset of the click train predicted the tone in 5 sec, and the tone followed the shock after 4 sec, then combining this information leads to the click train predicting the shock in 1 sec, and thus the rats should show fear. In contrast, rats trained with no gap between the termination of the click train and onset of tone showed much less suppression, which again makes sense, as integrating the events would result in the shock preceding the click train. Taking this idea even further, Gallistel and Gibbon (2000) have proposed that such temporal relationships form the basis for associative conditioning, rather than being an independently learned piece of information. 
In support of this idea, it has been demonstrated that the rate of conditioning is determined by the ratio of the CS–US interval to the US–US interval, rather than the absolute duration of the CS–US interval (Gibbon and Balsam 1981). Furthermore, it has been shown that temporal information is acquired prior to any display of conditioned responding (Ohyama and Mauk 2001), and temporal control can be seen as soon as conditioned responding emerges (Balsam et al. 2002). Finally, temporal relationships have been demonstrated to modulate blocking (Barnet et al. 1993) and overshadowing (Blaisdell et al. 1998), and temporal specificity is seen in the pattern of autoshaped responding (Drew et al. 2004).
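The temporal integration described above for the Arcediano et al. (2003) procedure amounts to placing both training phases on a common timeline anchored at tone onset. The sketch below works through that arithmetic using the durations given in the text. It assumes that the 4-sec shock-to-tone interval is measured from shock onset to tone onset; the function and variable names are hypothetical.

```python
def predicted_shock_time(click_duration, gap, shock_lead):
    """Expected shock onset in sec relative to click-train onset.

    click_duration: length of the click train (sec).
    gap: interval from click-train offset to tone onset (sec).
    shock_lead: interval by which shock onset precedes tone onset (sec),
                assumed here to run from shock onset to tone onset.
    """
    tone_onset = click_duration + gap  # from sensory preconditioning
    return tone_onset - shock_lead     # from backward conditioning


CLICK_DURATION = 3.0  # sec
SHOCK_LEAD = 4.0      # sec

# Group trained with a 5-sec gap between click offset and tone onset.
gap_group = predicted_shock_time(CLICK_DURATION, 5.0, SHOCK_LEAD)
gap_group_after_offset = gap_group - CLICK_DURATION

# Group trained with no gap between click offset and tone onset.
no_gap_group = predicted_shock_time(CLICK_DURATION, 0.0, SHOCK_LEAD)

shock_follows_click_in_gap_group = gap_group_after_offset > 0
shock_precedes_click_in_no_gap_group = no_gap_group < 0
```

With the 5-sec gap, the integrated map places the shock 1 sec after click-train offset, so the click train predicts an upcoming shock. With no gap, the shock falls before click-train onset, so the click train carries no forward prediction of shock, consistent with the weaker suppression in that group.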

Modality effects

We found modality differences in the overall magnitude of PIT in Experiment 1 and in the temporal distribution of the PIT effect in Experiments 1 and 3. The difference in the temporal pattern of responding to the CS+ as a function of modality is reminiscent of prior appetitive conditioning work showing that the form of responding differed between auditory and visual stimuli. Holland (1977) showed that auditory CSs induced head jerk behaviors, whereas visual stimuli elicited rearing and approach to the food magazine. Similar differences in conditioned responding as a function of modality have also been seen in aversive conditioning (Kim et al. 1996). Subsequent work by Holland (1980) demonstrated that these different conditioned response forms further interacted with the CS–US duration. With short tone CS–US durations (i.e., ≤10 sec), head-jerking dominated over magazine approach in a relatively static manner over an extended 60-sec evaluation period, whereas with longer tone CS–US durations (≥30 sec), head-jerking appeared for the first 10 sec, but then was replaced with magazine responding as the interval continued. With a light CS, rearing appeared early in the CS, particularly with long CS–US intervals, and was followed by magazine approach. Intriguingly, while the time of maximal magazine approach appeared to roughly coincide with the time of expected US delivery, this relation was stronger for visual than auditory stimuli, similar to what we found here. Similarly, recent work in our laboratory performed to examine cross-modal interactions in operant timing tasks has also shown asymmetric effects of modality that interact in a complex manner with duration (Matell and Swanton 2009; Matell and Kurti 2014). 
Specifically, when a tone cue signals a short duration (e.g., 8 sec), and a light signals a long duration (i.e., 24 sec), presentation of the tone + light compound results in scalar temporal responding at an intermediate duration, as though the rat was timing an averaged expectation of the two durations, albeit with a bias toward the light-signaled duration. In contrast, if the modality-duration relation was reversed (i.e., light-8 sec, tone-24 sec), rats responded to the tone + light compound in a nonscalar, heavily skewed manner, but again with responding concentrated around the light-signaled duration. Together, these data suggest that visual cues can allow for stronger temporal control of behavior than auditory cues, and that the consequences of such time–modality interactions may produce both qualitative (i.e., whether PIT produces temporally specific potentiation or not and whether compounding generates averaging or selection behavior) and quantitative (i.e., differential magnitude of PIT) differences in behavior. One explanation for visual stimuli providing better temporal control than auditory stimuli, both here and in our previous compounding studies, is an inherent bias in rats to associate visual cues with appetitive outcomes and auditory cues with aversive outcomes. Weiss et al. (1993) trained rats that tone + light compounds predicted either appetitive or aversive outcomes. When tested with the single cues, they found greater responding to the visual cue when it had been trained with an appetitive outcome, but greater responding to the auditory cue when it had been trained with an aversive outcome. An alternative, but not mutually exclusive, explanation of the modality differences in timing, may relate to the intensity of the auditory stimulus used (95 dB). The use of such a loud stimulus may have generated an abrupt increase in attention at stimulus onset that masked or interacted with the temporal expectation. 
Thus, it will be beneficial to examine temporal modulation in PIT with less intense auditory signals in future studies. We note, however, that the visual biases in our compounding work (Matell and Swanton 2009; Matell and Kurti 2014) were not accompanied by greater variability or unusually shaped responding on auditory alone trials, suggesting that the high auditory intensity did not disrupt normal timing processes. Furthermore, nonpublished work from our laboratory examined the impact of manipulations of the auditory intensity on the compounding biases, but found no effects.

Extinction effects

Experiment 3 showed that extinguishing the Pavlovian relationship prior to test weakens the transfer effect. One possibility is that extinction of the Pavlovian conditioned response generalized or transferred to the operant response, as the behavioral topographies were very similar (i.e., the operant response was a nosepoke into a round aperture at the back of the chamber and the recorded Pavlovian conditioned response was insertion of the head into the rectangular aperture of the food magazine at the front of the chamber). Although such generalization may have contributed to the sensitivity of PIT to extinction, it is unlikely to be a sufficient explanation. Indeed, Delamater et al. (2017) reported, like us, that extinction diminished the magnitude of sensory-specific PIT following limited training (i.e., 16 training trials). In that work, the operant responses were lever pressing and chain pulling, which do not bear obvious similarity to the Pavlovian conditioned response of food magazine entry. These recent results were in contrast to prior work from his laboratory (Delamater 1996), which did not show an effect of extinction training on PIT, or even showed an increase in PIT following extinction training (Holmes et al. 2010), due to the loss of the competing conditioned approach response. As one difference between Delamater's recent and prior work was the number of training trials, they subsequently assessed whether the length of training (ranging from 4 to 64 trials of training) moderated the effect. Surprisingly, they found that extinction diminished the magnitude of sensory-specific transfer equally across all groups. Thus, it appears unlikely that low associative strength, per se, is the primary variable determining whether extinction impacts transfer. Indeed, the present results further weaken this possibility. Specifically, the rats in Experiment 3 received ∼100 CS–US pairings, which should promote strong associative strength. 
Nonetheless, there was a decremental impact of extinction treatment. The primary remaining difference between Delamater's recent and earlier work was the temporal structure of the CS–US presentation. Specifically, in Delamater et al. (2017), rats were exposed to 120-sec CS cues, with the US being delivered after 10 sec. Likewise, in the present work, the animals were exposed to nonreinforced probe trials that lasted several times the CS–US interval used on reinforced trials. In contrast, in most earlier work, the CS coterminated with the delivery of the US (or the investigators used a variable reinforcement schedule which promoted continuous expectation of reinforcement throughout the CS). Delamater et al. (2017) therefore suggested that the use of extended CS durations/probe trials alters the sensitivity of PIT to extinction. However, why these extended duration/probe trials alter extinction sensitivity remains unclear. One possibility is that the addition of extended duration trials results in the need to learn to inhibit ongoing responding. In operant procedures, it has been shown that learning to stop responding in a temporally controlled manner is separate (and subsequent) to learning to begin responding (Church et al. 1994). For example, Balci et al. (2009) trained rats on a fixed interval schedule and then added probe trials. The initial pattern of responding on these probe trials was a fixed interval scallop prior to the criterion interval, followed by continued high-rate responding for the duration of the probe trials. As experience with the lack of reinforcement on these probe trials develops, subjects learn to terminate their responses after the criterion interval has passed, and thereby produce the commonly seen “peak,” for which the peak procedure is named. Similar behavioral patterns are seen in Pavlovian peak procedures (Kirkpatrick and Church 2000a; Drew et al. 2005; Tam and Bonardi 2012). 
In contrast, when the CS always terminates with the US, or in cases in which the US is delivered at variable times during the CS, there is little incentive to learn to inhibit conditioned responding, as it is naturally “inhibited” by the generation of consummatory behaviors related to the delivery of the US. Thus, we postulate that the susceptibility of PIT to extinction treatment depends on the prior development of temporally specific inhibition of responding. Specifically, we propose that as a result of the prior development of temporal inhibition, the context dependency of extinction is weakened. It has been repeatedly demonstrated that extinction is context-dependent (for review, see Bouton 2004). Furthermore, Bouton has argued that the calendar time associated with the extinction procedure is also a context, and that the subsequent change in calendar time could explain spontaneous recovery. As applied to the current PIT work, when subjects are trained with extended CS durations (or probe trials), they are effectively given extinction training within the current training context. This is obvious for nonreinforced probe trials, but it is also likely during extended duration CS trials. Specifically, Matell and Meck (1999) have suggested that delivery of reinforcement during an ongoing signal causes the internal clock to reset. Applied to extended CS trials, the delivery of the US would reset the accumulation of time, and as a result, the ongoing CS, which now terminates without reinforcement, can be understood as an extinction trial. In both cases, when the subsequent extinction treatment occurs, this extinction should generalize to the conditions in which inhibition/extinction occurred previously (i.e., during training), and thus the context-dependency would be weaker. Therefore, upon the PIT test, which given the opportunity to respond and the change in calendar time can be viewed as yet another context, the impact of the extinction treatment will be seen. 
In summary, the current data, along with those from Delamater (Delamater and Holland 2008; Delamater et al. 2017), further support the idea that the temporal relationship between events is a central part of what is learned during conditioning, and indicate that this information can be transferred across different learning systems (Rescorla and Solomon 1967). PIT is a frequently used procedure for examining the motivational states involved in drug seeking behavior (Robinson and Berridge 1993; Everitt et al. 2001; Volkow et al. 2003; Kelley 2004; Hogarth et al. 2013). Indeed, general and sensory-specific PIT are thought to mimic the states of motivation and expectancy that facilitate patterns of habitual and goal-directed drug seeking, respectively (Hogarth et al. 2013). While it is known that a large variety of stimuli influence drug seeking behavior, the influence of temporal expectations on drug seeking behavior has rarely been examined (although see Di Ciano and Everitt 2004). The current data suggest that temporal information, in addition to sensory cues and physical contexts, may be an important feature of the associative mechanisms involved in addiction. As such, understanding the mechanisms and functions of temporal expectancy will likely be critical to the development of treatment.

Materials and Methods

Experiment 1

Subjects and apparatus

Subjects were 20 male Sprague-Dawley rats (Rattus norvegicus, Harlan, Indianapolis, IN). Subjects were ∼3 mo old at the start of the experiment, at which time they were placed on a restricted diet to maintain their body weights at 85%–90% of free feed weight, adjusted for growth. Rats had ad libitum access to water in their home cages and were housed in pairs. Colony room lights were set to a 12-h light–dark cycle. Training and testing occurred in standard operant-conditioning chambers (30.5 × 25.4 × 30.5 cm; Coulbourn Instruments). The bottom of the chambers consisted of a stainless steel grate. The top, left, and right sides of the chambers were composed of aluminum, while the front and back sides were Plexiglas. Three nosepoke apertures with photobeam detector circuits were located on the right wall, along with a seven-tone audio generator set to produce 95 dB tones. The left wall contained a pellet dispenser used to deliver 45 mg reinforcement pellets (Bio-Serv) into a food magazine equipped with a photobeam detector circuit. Also on the left wall was an 11 lux houselight. Stimulus control and data collection were conducted using an operant-conditioning control program (Graphic State, Coulbourn Instruments).

Procedure

The experiment consisted of four phases: Pavlovian training, operant response training, Pavlovian extinction training, and finally a testing phase. All sessions were conducted at the same time of day during the light phase, lasted 2 h, and took place 5 d per week.
Pavlovian training (24 sessions). During Pavlovian training, the nosepoke apertures in the operant conditioning chambers were covered by an aluminum sheet. Trials began with the onset of a cue, either a CS+ or CS− (houselight and tone, counterbalanced). The cues lasted 60 sec. If the trial was a CS+ trial, termination of the cue would co-occur with delivery of a reinforcement pellet to the food magazine. On CS− trials, no reinforcement was provided. Following each trial was an inter-trial interval (ITI) lasting six to seven times the length of the CS–US interval. After 8 sessions, probe trials were added, in which the CS+ and CS− were presented for three to four times the CS–US duration (i.e., 180–240 sec) with no US delivery. Probe trials comprised 20% of the CS+ and CS− trials.
Operant training (eight sessions). During operant training, the nosepoke apertures were uncovered and a photobeam detector recorded operant responses on the center nosepoke. Each beam break was treated as a discrete response. To register a new response, the rat had to remove its snout from the nosepoke and then re-insert it. The first 2 d of training began with a continuous reinforcement schedule in which a single response on the center nosepoke earned a single reinforcement pellet. After 20 reinforcements had been earned, the schedule was lengthened to a VI-5 sec. The delay was increased in a stepwise manner to VI-10, VI-20, VI-40, and VI-60 sec once 20 reinforcers had been earned on each delay schedule. The VI-60 sec schedule was then maintained for the remainder of the 2 h session. After responding was consistently occurring, only the VI-60 sec schedule was used.
Pavlovian extinction training (five sessions). Rats were given extinction training to diminish approach to the food magazine that could potentially interfere with the temporal dynamics of operant responding (Holmes et al. 2010). During the extinction sessions, the nosepoke apertures were covered by an aluminum sheet. Subjects received five sessions in which CS+ and CS− probe trials were presented without reinforcement.
Transfer testing (four sessions). During transfer testing, the nosepoke apertures were again available. CS+ and CS− probes lasting three to four times the CS–US duration were presented without reinforcement in an ABBABAAB pattern with A and B signifying CS+ and CS−, respectively. The inter-trial interval was six to seven times the CS–US duration.

Analysis

The times of entrance into, and out of, the center nosepoke aperture and the food magazine were recorded with 20 msec accuracy. A significance level of α = 0.05 was used for all analyses, and the Greenhouse–Geisser correction was used in cases where the sphericity assumption was violated.
Pavlovian analysis. The proportion of time in which the snout occupied the food magazine as a function of time before and after the onset of the CS on probe trials was computed, using bins that were one-fourth the duration of the CS–US interval. Due to low numbers of probe trials per session (mean = 1.32 ± 0.38), as well as low occupancy rates, we pooled these data over the final 10 Pavlovian training sessions to construct average response functions. To confirm that conditioning occurred, the average response rate in the 60 sec prior to CS onset was compared with the average response rate during the first 60 sec of the CS using a repeated-measures ANOVA with Period (pre-CS, CS+, CS−) as a within-subjects factor, and CS+ Modality (light, tone) as a between-subjects factor. To evaluate whether conditioned approach behavior induced by the CS+ varied as a function of time, occupancy in each 15-sec bin following CS+ onset was entered into a repeated-measures ANOVA with Time as a within-subjects factor, and Modality as a between-subjects factor.
Transfer analysis. To assess whether or not Pavlovian-instrumental transfer occurred, mean response rates during the first 60 sec of CS+ and CS− presentation (i.e., up to the time that the US would have occurred), as well as the 60-sec period prior to CS onsets, were calculated. These rates were entered into a repeated-measures ANOVA with Period (prerate, CS+, CS−) as a within-subjects factor, and CS+ Modality (light, tone) as a between-subjects factor.
Temporal pattern of transfer. To assess whether CS+ induced response rates varied as a function of time following CS+ onset, responses during the 180-sec CS+ were placed in bins that were one-fourth the duration of the CS–US interval (i.e., 15 sec). As the CS+ induced change in response rates varied over both sessions (F(3,54) = 5.21, P < 0.05) and trials within a session (F(3,54) = 4.77, P < 0.05), presumably due to the fact that all transfer sessions were run in extinction, we normalized the rates on each trial by dividing the absolute response rate in each bin by the mean response rate (from −180 to 180 sec) for that trial. These normalized rates were then entered into a repeated-measures ANOVA with Time as a within-subjects factor and CS+ Modality as a between-subjects factor.
Single trial analysis. We examined the times at which the response rate changed using the change point detection algorithm developed by Gallistel et al. (2004), and used for interval timing experiments by multiple laboratories (Taylor et al. 2007; Balci et al. 2009). Briefly, the algorithm moves through the sequence of inter-response intervals on each trial, and evaluates the relative likelihood that the current interval comes from the same distribution as the previous sequence of inter-response intervals. The inter-response intervals are assumed to be exponentially distributed (i.e., generated by a Poisson process). The log odds (logit) that a change in rate would be identified was set at 1.3, which corresponds to a P-value of ∼0.05. We began the analysis for each trial with a 360-sec inter-trial interval appended before and after the trial to provide accurate assessment of the baseline (non-CS) rate.
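The per-trial binning and normalization described above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study. The function name, the representation of responses as times in seconds relative to CS+ onset, and the handling of a trial with no responses are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def normalized_binned_rates(
    response_times: Sequence[float],
    cs_us_interval: float = 60.0,
    window: Tuple[float, float] = (-180.0, 180.0),
) -> List[float]:
    """Bin one trial's responses and divide each bin's rate by the trial mean.

    response_times: response times in sec relative to CS+ onset (assumed format).
    Bins are one-fourth of the CS-US interval wide (15 sec for a 60-sec interval).
    """
    bin_width = cs_us_interval / 4.0
    start, end = window
    n_bins = int(round((end - start) / bin_width))

    # Count responses falling in each bin of the analysis window.
    counts = [0] * n_bins
    for t in response_times:
        if start <= t < end:
            counts[int((t - start) // bin_width)] += 1

    # Absolute rate per bin (responses per sec), then the mean over the window.
    rates = [c / bin_width for c in counts]
    mean_rate = sum(rates) / n_bins

    # Assumption: a trial with no responses yields all zeros rather than an error.
    if mean_rate == 0:
        return [0.0] * n_bins
    return [r / mean_rate for r in rates]


if __name__ == "__main__":
    trial = [-170.0, 10.0, 20.0, 50.0]  # hypothetical response times
    print(normalized_binned_rates(trial))
```

Because each bin is divided by the trial's own mean, the normalized values average to 1 within a trial, which removes between-trial and between-session differences in overall response rate while preserving the temporal pattern.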

Experiment 2

Experiment 2 was conducted in an identical manner to Experiment 1, with two exceptions. First, Pavlovian conditioning was carried out with a random time (RT) 60-sec schedule, instead of the fixed time 60-sec schedule used in Experiment 1. Cue length for the RT 60-sec schedule was programmed as a 5% chance of cue termination every 3 sec, resulting in cues that varied in duration but lasted an average of 60 sec. If the trial was a CS+ trial, termination of the cue would co-occur with delivery of a grain pellet. CS− trials were run with the same RT schedule, but no reinforcement was provided upon CS termination. As in Experiment 1, nonreinforced probe trials lasting 180–240 sec were presented on 20% of trials. Second, only 10 rats were tested, with the CS+ modality counterbalanced.
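To see why a 5% chance of termination every 3 sec gives a 60-sec average, the following Python sketch simulates such a schedule. It is an illustration only, not the Graphic State program used in the study. It assumes the termination check occurs at the end of each 3-sec step, so the shortest possible cue is 3 sec; under that assumption cue durations follow a geometric distribution with mean 3 / 0.05 = 60 sec. The function names are hypothetical.

```python
import random


def sample_rt_cue_duration(
    rng: random.Random, p_terminate: float = 0.05, step_sec: float = 3.0
) -> float:
    """Draw one cue duration from the random time schedule.

    Assumption: the cue is checked for termination at the end of every step.
    """
    duration = 0.0
    while True:
        duration += step_sec
        if rng.random() < p_terminate:
            return duration


def expected_rt_cue_duration(p_terminate: float = 0.05, step_sec: float = 3.0) -> float:
    """Mean of the geometric schedule: step length divided by termination probability."""
    return step_sec / p_terminate


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulation is repeatable
    durations = [sample_rt_cue_duration(rng) for _ in range(20000)]
    print("simulated mean (sec):", sum(durations) / len(durations))
    print("expected mean (sec):", expected_rt_cue_duration())
```

Because the termination probability is constant across steps, elapsed time within the cue carries no information about when the US will arrive, which is the property that distinguishes this schedule from the fixed time schedules of Experiments 1 and 3.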

Experiment 3

Experiment 3 was conducted identically to Experiment 1, with the following exceptions. The CS–US interval was extended to 120 sec. To keep cued reinforcement rate identical, the US was doubled to two 45 mg grain pellets. In addition, to equate temporal informativeness of the CS+, the inter-trial interval was also doubled (Gibbon and Balsam 1981; Balsam et al. 2006). As a result, the number of trials per session was cut in half, and therefore, rats were given Pavlovian training for twice as many sessions (i.e., 48). Half the rats were given Pavlovian extinction after operant training to minimize conditioned approach, whereas the other half of the rats were placed in the chambers for the 2 h session, but no stimuli were presented. Due to experimenter error, the ITI was excessively short on the second test session. Therefore, analysis of PIT is restricted to the first test session.
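The matching of parameters between Experiments 1 and 3 can be checked with simple arithmetic, sketched below in Python. The ratio of ITI to CS–US interval is used here only as a simple proxy for temporal informativeness, which is an assumption of this sketch; the cited accounts define informativeness in their own terms. The function names are hypothetical.

```python
import math


def cued_reinforcement_rate(pellets: int, cs_us_sec: float) -> float:
    """Pellets delivered per sec of CS+ on reinforced trials."""
    return pellets / cs_us_sec


def iti_to_cs_ratio(iti_sec: float, cs_us_sec: float) -> float:
    """Ratio of ITI to CS-US interval (a simple proxy, assumed for illustration)."""
    return iti_sec / cs_us_sec


if __name__ == "__main__":
    # Experiment 1: one pellet after 60 sec, ITI of six to seven times 60 sec.
    exp1_rate = cued_reinforcement_rate(1, 60.0)
    exp1_ratios = (iti_to_cs_ratio(360.0, 60.0), iti_to_cs_ratio(420.0, 60.0))

    # Experiment 3: two pellets after 120 sec, ITI doubled to 720-840 sec.
    exp3_rate = cued_reinforcement_rate(2, 120.0)
    exp3_ratios = (iti_to_cs_ratio(720.0, 120.0), iti_to_cs_ratio(840.0, 120.0))

    print("rates match:", math.isclose(exp1_rate, exp3_rate))
    print("ITI/CS-US ratios:", exp1_ratios, exp3_ratios)
```

Doubling both the US magnitude and the ITI along with the CS–US interval leaves these two quantities unchanged, so differences in the timing of transfer between the experiments are less likely to reflect reinforcement rate or trial spacing.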
  60 in total

Review 1.  What makes us tick? Functional and neural mechanisms of interval timing.

Authors:  Catalin V Buhusi; Warren H Meck
Journal:  Nat Rev Neurosci       Date:  2005-10       Impact factor: 34.870

2.  Timing in the absence of clocks: encoding time in neural network states.

Authors:  Uma R Karmarkar; Dean V Buonomano
Journal:  Neuron       Date:  2007-02-01       Impact factor: 17.173

Review 3.  Goal-directed instrumental action: contingency and incentive learning and their cortical substrates.

Authors:  B W Balleine; A Dickinson
Journal:  Neuropharmacology       Date:  1998 Apr-May       Impact factor: 5.250

4.  Temporal encoding as a determinant of blocking.

Authors:  R C Barnet; N J Grahame; R R Miller
Journal:  J Exp Psychol Anim Behav Process       Date:  1993-10

5.  Control of instrumental performance by Pavlovian and instrumental stimuli.

Authors:  R A Rescorla
Journal:  J Exp Psychol Anim Behav Process       Date:  1994-01

6.  Isolation of an internal clock.

Authors:  S Roberts
Journal:  J Exp Psychol Anim Behav Process       Date:  1981-07

7.  The general and outcome-specific forms of Pavlovian-instrumental transfer are differentially mediated by the nucleus accumbens core and shell.

Authors:  Laura H Corbit; Bernard W Balleine
Journal:  J Neurosci       Date:  2011-08-17       Impact factor: 6.167

Review 8.  Appetitive Pavlovian-instrumental Transfer: A review.

Authors:  Emilio Cartoni; Bernard Balleine; Gianluca Baldassarre
Journal:  Neurosci Biobehav Rev       Date:  2016-09-28       Impact factor: 8.989

9.  Reinforcement-induced within-trial resetting of an internal clock.

Authors:  M S Matell; W H Meck
Journal:  Behav Processes       Date:  1999-04       Impact factor: 1.777

10.  Conditioned reinforcing properties of stimuli paired with self-administered cocaine, heroin or sucrose: implications for the persistence of addictive behaviour.

Authors:  Patricia Di Ciano; Barry J Everitt
Journal:  Neuropharmacology       Date:  2004       Impact factor: 5.250
