Acquisition of conditioned responding in a multiple schedule depends on the reinforcement's temporal contingency with each stimulus.

Lorenzo Morè, Greg Jensen

Abstract

Forty mice acquired conditioned responses to stimuli presented in a multiple schedule with variable inter-trial intervals (ITIs). In some trials, reinforcement was preceded by a variable-duration conditioned stimulus (CS), while other trials were reinforced following a distinctive fixed-duration CS. A third stimulus was presented but never paired with reinforcement. Subjects in five groups experienced ITIs of different durations. Acquisition of responding to each stimulus depended only on the cycle-to-trial ratio (C/T), and thus on the temporal contingency of each stimulus. Acquisition was unaffected by whether CSs were of fixed or variable duration.

Year:  2014        PMID: 24737917      PMCID: PMC3994502          DOI: 10.1101/lm.034231.113

Source DB:  PubMed          Journal:  Learn Mem        ISSN: 1072-0502            Impact factor:   2.460


Pavlovian or Classical conditioning arises when predictive relationships between sets of stimuli are learned by organisms. The unconditioned stimulus (US) leads to an unconditioned response (UR). When an additional stimulus predicts US onset, it may become a conditioned stimulus (CS) and elicit a corresponding conditioned response (CR) (Pavlov 1927). The implications of Pavlovian conditioning have been contested from the outset (Windholz 1986) and remain contentious. Many published models build on the associative framework set forward by Rescorla and Wagner (1972). Although such models can explain many conditioning phenomena, most cannot accommodate the passage of time as a continuous measure and depend on artificially delimited trials (Gallistel and Gibbon 2000). Attempts to overcome this limitation have yielded hybrid theories that mix associations with models of timing (e.g., Machado 1997; Kirkpatrick and Church 2000; Kirkpatrick 2002). These include the SOP model (Wagner 1981; Brandon et al. 2003), the “elemental model” of McLaren and Mackintosh (2000), and the SOCR model (Stout and Miller 2007). In general, associative models are additive over consecutive trials, and use asymptotically bounded functions within trial. These diminishing-returns functions resemble those used to study temporal discounting. A simplified example of this isomorphism is as follows: If a subject accrues one “unit” of associative strength for a 1-sec stimulus and displays a 10% discounting rate per second, a 2-sec stimulus is expected to have 1.9 (1 + 0.9) units, and a 4-sec stimulus would be expected to have 3.4 (1 + 0.9 + 0.81 + 0.73) units. The accumulated association over 10 fixed 4-sec trials would thus be 34 units. Meanwhile, an exponential distribution of 10 trials with a mean of 4 sec would only yield 30 units as a result of discounting (see Jennings et al. 2013 for further elaboration). 
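The discounting arithmetic in the example above can be checked with a short sketch. The function names are hypothetical, and the treatment of exponentially distributed durations is an assumption: it extends the per-second sum to fractional durations as (1 - r^t) / (1 - r) and takes its expectation in closed form. It is an illustration of the arithmetic, not the simulation reported by Jennings et al. (2013).

```python
import math

def accumulated_strength(duration_s, discount=0.10):
    """Units accrued over a whole number of seconds: 1 + r + r^2 + ..., with r = 1 - discount."""
    r = 1.0 - discount
    return sum(r ** k for k in range(duration_s))

def expected_strength_exponential(mean_s, discount=0.10):
    """Expected units when durations are exponentially distributed (assumed
    continuous extension of the sum above: (1 - r**t) / (1 - r))."""
    r = 1.0 - discount
    # For T ~ Exponential(mean), E[r**T] = 1 / (1 - mean * ln r)
    expected_r_pow_t = 1.0 / (1.0 - mean_s * math.log(r))
    return (1.0 - expected_r_pow_t) / (1.0 - r)

fixed_total = 10 * accumulated_strength(4)               # ten fixed 4-sec trials
variable_total = 10 * expected_strength_exponential(4.0)  # ten trials, mean 4 sec

print(round(accumulated_strength(2), 2))  # 1.9
print(round(accumulated_strength(4), 2))  # 3.44
print(round(fixed_total, 1))              # 34.4
print(round(variable_total, 1))           # 29.6
```

Under these assumptions the totals round to the 34 and 30 units quoted in the text: variable durations accrue less because the discounting function is concave in duration.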
This “accumulated association” approach is appealing because it predicts such phenomena as blocking, overshadowing, latent inhibition, and the emergence of timed responding. It also resembles Hebbian synaptic strengthening via long-term potentiation (LTP) and provides an account for spatial memory (Abel and Lattal 2001; Whitlock et al. 2006; Sanderson et al. 2008, 2010). However, models of additive association fail to account for other aspects of conditioning. For example, although subjects learn during trace conditioning procedures and respond to both CS and Trace intervals (Flesher et al. 2011), these models incorrectly predict that trace interval responding should decline following the CS offset, as associative strength declines (Vogel et al. 2004). Associative models also predict that many short trials should facilitate learning relative to a few long trials (as shown by simulation in Gottlieb 2008), but experimental subdivision of CS presentations into long or short intervals has repeatedly been shown to be irrelevant to acquisition of conditioned responding. In one such demonstration, Lattal (1999) reported that trial duration (denoted by T, lasting from CS onset to CS offset) did not predict acquisition. Instead, acquisition depended only on the ratio of T to the inter-trial interval (ITI) duration, denoted by I, even though absolute values of I and T varied considerably (see also Gibbon et al. 1977, 1997). More recent accounts use the average cycle time between reinforcer deliveries (denoted by C) and describe acquisition in terms of the C/T ratio (Balsam et al. 2006). Across these studies, only cumulative CS and ITI time, as well as CS–US contingency, are required to predict acquisition speed (Gottlieb 2008). These results are consistent with Rate Expectancy Theory (RET) (Gallistel and Gibbon 2000), which eschews associations and argues that conditioned responding reflects knowledge of temporal rates and conditional probabilities.
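As a minimal sketch of the C/T ratio, assume the simplest case of a single CS that is reinforced on every trial, so that the cycle time is C = I + T. The function name and the example durations are hypothetical and are not the schedule parameters of the present experiment.

```python
def cycle_to_trial_ratio(mean_iti_s, mean_cs_s):
    """C/T for a single reinforced CS, assuming C = I + T (one reinforcer per trial)."""
    cycle = mean_iti_s + mean_cs_s
    return cycle / mean_cs_s

# Doubling both I and T leaves the ratio unchanged; lengthening only I raises it.
print(cycle_to_trial_ratio(60, 20))   # 4.0
print(cycle_to_trial_ratio(120, 40))  # 4.0
print(cycle_to_trial_ratio(160, 20))  # 9.0
```

The first two calls illustrate why absolute values of I and T can vary considerably without changing the predicted speed of acquisition.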
Subsequent development of this approach has emphasized information theory (Balsam et al. 2010; Jensen et al. 2013; Ward et al. 2013), which does not require dividing time artificially into trials. Further, although LTP is correlated with associative learning in some fashion, there is no direct evidence that LTP is itself the primary mechanism of memory (Gallistel and Matzel 2013). New techniques for studying associative learning and long-term memory (Corrêa et al. 2012; Reichelt and Lee 2012; Alberini and LeDoux 2013; Bock et al. 2013; da Silva et al. 2013) and the rise of synaptic tagging (Doyle and Kiebler 2011; Päpper et al. 2011) suggest that LTP is neither rapid nor flexible enough to account for the transition from unconditioned to conditioned responding (London and Hausser 2005). Although the phenomenon of Pavlovian conditioning is well established, models that rely on association by contiguity have struggled to characterize the temporal factors that reliably predict such conditioning. Thus, understanding how time relates to Pavlovian conditioning is central to identifying its physiological basis. The C/T ratio predicts acquisition time across many experiments performed by different labs (for review, see Balsam et al. 2010), but models make different predictions regarding the influence of trial-to-trial specifics. For example, Balsam and Gallistel (2009) proposed that fixed temporal intervals were objectively more informative than variable intervals. Contrary to this prediction, Ward et al. (2012) manipulated whether the C and T intervals were fixed or variable and found that only the overall C/T ratio was predictive of acquisition. They concluded that the subjective informativeness of a stimulus could be quantified by the average time to reinforcement given the stimulus (T) relative to the overall time to reinforcement in general (C), analogous to the mutual information between the overall rates of CS presentation and reinforcement (Jensen et al. 2013).
Jennings et al. (2013) presented a contrary view. They tested whether acquisition of conditioned responding depended on either fixed or variable CS durations. Over four experiments, they reported higher response rates given conditioned stimuli of fixed durations than of variable durations. Although they also reported that fixed stimuli yielded faster acquisition than variable stimuli under some conditions, four factors complicate the results. First, rate estimates were computed, on a trial-by-trial basis, as “difference scores” (estimated CS rate minus estimated ITI rate). These estimates were sampled over narrow windows of time, and displayed high variability. Second, this variability was inconsistent trial-to-trial due to variations in the CS duration, violating the statistical assumption that observations are identically distributed. Third, Jennings et al. (2013) applied a smoothing function to their data to reduce the variance, violating the assumption that observations were independent of one another. The fourth and most serious difficulty is the use of group averages to describe behavior over time. In most cases, “learning curves” averaged across subjects bear little relationship to the behaviors exhibited by individual subjects (Gallistel et al. 2004). Consequently, it is unclear how to interpret the curve-fitting used by Jennings et al. (2013) to estimate acquisition speeds, since the resulting curve is unlikely to resemble acquisition of individual subjects.
The present study undertook a within-subject comparison of acquisition speed given fixed and variable CS durations, presented in a multiple schedule. CS durations were scheduled to ensure equal cumulative exposure to a Fixed and a Variable CS distribution within each session, and employed a range of C/T ratios.
Our prediction was that no substantive difference should arise as a result of fixed or variable stimulus durations, and that the C/T ratio should, instead, explain any differences observed in response rate and time to acquisition. Forty naïve mice that were food restricted and maintained at 85%–90% of their ad libitum body weight were employed. Subjects were divided into five groups of eight. Each trial presented subjects with one of three schedules, consisting of an ITI followed by a stimulus. In all cases, the ITI consisted of a variable component (averaging 60 sec) and an additional fixed component (whose duration differed across groups). In each Fixed schedule (F+), the CS was a 20-sec stimulus, whereas in the Variable schedule (V+), CS durations were exponentially distributed with a mean of 20 sec. Both the F+ and V+ conditions delivered a sucrose pellet at the time of CS offset. In the Control schedule (C–), a variable ITI preceded a variable 20-sec stimulus, but no reinforcement was provided; this condition tested whether subjects could discriminate among the stimuli, rather than generalizing. Subjects received no previous training with these stimuli, and the sessions described represent their entire learning history involving these cues. Figure 1 depicts the schedule parameters for each of the five groups. In all schedules, head entries to the pellet feeder were recorded in order to identify anticipatory CRs to each stimulus. The experiment was carried out in four identical operant chambers.
Figure 1.

Schedule parameters for Groups 1–5. Intervals are marked as “F” for fixed intervals and “V” for variable intervals, with mean durations indicated in each case. The width of each segment is proportional to the mean intervals. Stimuli were limited during the intervals marked as black boxes (fixed) or white boxes (variable), and reinforcer delivery occurred at stimulus offset (marked with a gray circle). Groups 1, 3, 4, and 5 use 2-kHz and 10-kHz tones as stimuli, counterbalanced within group. Group 2 uses 6 kHz and an LED light as stimuli, also counterbalanced. Not shown is the C– condition. Its ITI was V-60s + F-100s, and the CS was a variable 20-sec stimulus. Group 2 used the houselight as the C– stimulus, while all other groups used a 6-kHz tone.

Estimates of ITI response rates were limited to the 20-sec interval prior to CS onset (the “pre-CS interval”). Subjects received no cues to signal the onset of the pre-CS interval, whose duration was selected merely to facilitate comparison with responding during the CS interval (see Supplemental Material for full details). A repeated-measures ANOVA of the difference scores of CS minus ITI response rates was performed separately for each group, using stimulus type (F+ vs. V+) as a between-subject factor and session number as a within-subject factor. In all groups, a significant effect for session number was identified (FGrp1[1,3] = 16.14, FGrp2[1,4] = 20.96, FGrp3[1,5] = 20.81, FGrp4[1,5] = 10.55, FGrp5[1,5] = 17.46, all P < 0.001), suggesting that all groups acquired conditioned responding over time. Groups 1, 2, and 3 (for whom F+ and V+ yielded identical durations) showed no main effect for stimulus type (FGrp1[1,14] = 0.3, FGrp2[1,14] = 1.31, FGrp3[1,14] = 0.01, all P > 0.25) and no interaction between stimulus type and session number (FGrp1[14,42] = 0.06, FGrp2[14,56] = 0.12, FGrp3[14,70] = 0.22, all P > 0.95).
A marginal main effect for stimulus type was observed in Group 4 (FGrp4[1,14] = 3.01, P = 0.1) and a significant effect was observed in Group 5 (FGrp5[1,14] = 11.15, P < 0.005). Correspondingly, no interaction was observed in Group 4 (FGrp4[14,70] = 0.67, P = 0.6), but a significant interaction was observed in Group 5 (FGrp5[14,70] = 2.47, P < 0.05). Although these repeated-measures ANOVAs test for effects in each group individually, they are not omnibus tests of the hypotheses that (1) response rate was similar regardless of stimulus type, and (2) changes in response rate were similar over time. To examine these, a hierarchical mixed-model ANOVA was performed on difference scores for all subjects. Rates were calculated over blocks of 10 consecutive trials. Fixed effects were calculated for block (eight blocks of 10 trials), stimulus type (F+ vs. V+), and group (1–5), while a random effects per-subject factor was nested within the group factor. All interactions among the fixed effects were included. The raw data, complete results, and the formal specification of this analysis are provided in the Supplemental Material; of these, two results merit particular attention. Contrary to our expectation, a significant difference was detected for stimulus type (Fstim[1,525] = 6.41, P < 0.02). Although significant, this effect was very weak, explaining <1% of the sample variance (η² = 0.004), as compared to the effect of block (η² = 0.235) and group (η² = 0.177). Additionally, there was no significant interaction between block and stimulus type (Fstim[7,525] = 1.01, P = 0.42). From these, we conclude that although a difference in the response rate may arise from fixed vs. variable stimuli, its effect size was negligible and did not impact when conditioned responding emerged. The session-wise response rates for each group are depicted in Figure 2. However, because these learning curves are group averages, they do not represent acquisition in a typical subject.
In order to quantify individual acquisition, a Bayesian change-point analysis was also performed for each subject, modifying the CPR algorithm (Jensen 2013). The CPR algorithm first identified whether a change-point was appropriate by computing the marginal likelihood for both a no-change model and a model that included acquisition. When the evidence sufficiently favored introducing a change-point, it then identified the acquisition trial using maximum likelihood.
Figure 2.

Difference scores for response rates (CS rate minus rate during a 20-sec pre-CS interval) of Groups 1–5, depicted across sessions on the Fixed+ (white points, solid lines), Variable+ (gray points, dashed lines), or Control– (black points, dashed lines) trials. Error bars indicate one standard error.

Because the number of responses in a given interval is a discrete count, it is properly modeled using a Poisson distribution. Prior to acquisition (when the subject does not treat the CS as informative), the Bayesian analysis presumed that responses were drawn from a single Poisson distribution with a single rate parameter. After acquisition, the analysis presumed that two Poisson processes were necessary to describe the differing rates of CS and ITI responding. Because marginal likelihoods automatically correct for model parsimony, the CPR algorithm favored the single distribution model until the CS and ITI response rates differed unambiguously. Each subject was analyzed in isolation to obtain its distinct change-points for the F+, V+, and C– conditions. Figure 3A plots cumulative responses as a function of cumulative time in the pre-CS interval (left) and as a function of cumulative CS time (right) for Subject 5 (in Group 1), whose responding was characteristic. Solid lines indicate the F+ schedule, and dashed lines indicate the V+ schedule. The arrows indicate the change-point identified by the CPR algorithm for each of the schedules (trial 26 for the F+ schedule and trial 32 for the V+ schedule). When plotted as cumulative exposure time, the two conditions yield almost indistinguishable patterns of responding. Figure 3B plots the corresponding data for Subject 25, whose performance was representative of subjects in Group 4. The different acquisition times for the two schedules are unambiguous: Conditioned responding was acquired on trial 46 for the F+ schedule and on trial 15 for the V+ schedule.
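The one-rate versus two-rate comparison described above can be sketched as follows. This is not the CPR algorithm itself: the Gamma(1, 1) prior on each Poisson rate, the uniform prior over candidate change-points, the evidence threshold `min_log_bf`, and all names and data are assumptions made for illustration.

```python
import math

def log_evidence(count, time, a=1.0, b=1.0):
    """Log marginal likelihood of Poisson counts under a Gamma(a, rate=b) prior.
    Terms that depend only on the individual observations are omitted: they are
    identical in every model compared here and cancel in the Bayes factor."""
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + count) - (a + count) * math.log(b + time))

def find_change_point(cs_counts, cs_times, iti_counts, iti_times, min_log_bf=3.0):
    """Return the 0-based index of the first post-acquisition trial, or None."""
    n = len(cs_counts)
    # No-change model: CS and ITI responses share one rate on every trial.
    no_change = log_evidence(sum(cs_counts) + sum(iti_counts),
                             sum(cs_times) + sum(iti_times))
    scores = {}
    for k in range(1, n):
        # Before trial k: one pooled rate. From trial k on: separate CS and ITI rates.
        pre_count = sum(cs_counts[:k]) + sum(iti_counts[:k])
        pre_time = sum(cs_times[:k]) + sum(iti_times[:k])
        scores[k] = (log_evidence(pre_count, pre_time)
                     + log_evidence(sum(cs_counts[k:]), sum(cs_times[k:]))
                     + log_evidence(sum(iti_counts[k:]), sum(iti_times[k:])))
    best = max(scores, key=scores.get)
    peak = scores[best]
    # Evidence for "a change somewhere": average over candidate change-points.
    change = peak + math.log(sum(math.exp(s - peak) for s in scores.values()) / len(scores))
    return best if change - no_change > min_log_bf else None

# Hypothetical subject: 10 trials with no CS-specific responding, then 10 with elevated CS rate.
cs_counts = [1] * 10 + [8] * 10
iti_counts = [1] * 20
times = [20.0] * 20  # 20-sec CS and 20-sec pre-CS interval on every trial
acquired = find_change_point(cs_counts, times, iti_counts, times)
print(acquired)  # 10
```

Because the no-change model has fewer free rates, it wins whenever CS and ITI responding are similar; for example, a subject with identical counts in both intervals on every trial yields `None`.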
Figure 3.

Change-point analysis for the acquisition of conditioned responding. (A) Cumulative responses for a representative subject in Group 1 during the pre-CS interval of the ITI (left) and the CS presentation (right) for Fixed+ trials (solid line) and Variable+ trials (dashed line). The arrows indicate the point identified by the change-point analysis to be the likely point of acquisition. (B) An identical plot to that in A, showing results for a representative subject in Group 4, when F+ and V+ were expected to be learned at different times. (C–F) Cumulative plots showing the proportion of subjects who had acquired by a given trial. Given their similarity, Groups 1 and 2 were pooled in C.

Figure 3C–F show cumulative acquisition for different groups. Groups 1 and 2 are pooled in Figure 3C, as both groups appeared to belong to the same distribution. In general, subjects with lower C/T ratios took longer to acquire, but no substantial difference in acquisition was apparent between the F+ and V+ schedules. The C– schedule is not shown because only three subjects out of the 40 had detectable acquisition (two in Group 1 and one in Group 2), which were all inhibitory with respect to the CS. Testing whether the F+ and V+ conditions differed was complicated by those subjects who did not acquire at all. Because these subjects may have acquired given enough time, the most appropriate course was to treat them as having “off-scale measurements” and to use a nonparametric test of within-subject differences. We used the Wilcoxon signed-rank test (Wilcoxon 1945) and set all off-scale values for “trial of acquisition” to 200 (although any very large number yields identical results). Additionally, a two-sample Kolmogorov–Smirnov test was performed as an omnibus test of whether change-points differed across groups.
Because this test makes no distributional assumptions, it permits a test of whether the distribution of F+ change-points differed in any way from that of the V+ change-points, pooled across groups. We did not find a significant difference as a function of stimulus type (DF+,V+ = 0.233, P = 0.270). Groups 1 and 2, when pooled, showed no effect of F+ vs. V+ (P = 0.46). Group 3 also fell short of significance (P = 0.06). However, significant differences were observed in Group 4 (P < 0.03), and Group 5 (P < 0.04). The V+ condition was acquired first in Group 4, whereas the F+ condition was acquired first in Group 5. The significant differences observed in Groups 4 and 5, as well as the direction of those effects, were in line with expectation. However, the marginal result in Group 3 was surprising, given the small difference visible in Figure 3D. In practice, six out of the eight subjects in Group 3 acquired on the second schedule within four trials of acquiring on the first, so our interpretation is either that this effect is consistently small, or that it would evaporate with additional data. If anything, this marginal result points to faster acquisition in the V+ condition, a result in the opposite direction of associative accounts. To summarize, we did not observe differences in response rates as a function of stimulus type when the stimulus durations for F+ and V+ were identical (Groups 1, 2, and 3), and although a difference was detected in our omnibus analysis, its effect size was trivially small. When the C/T ratios differed (Groups 4 and 5), faster responding was associated with the larger C/T ratio. According to change-point analyses performed for each subject, time to acquisition was also chiefly determined by the C/T ratio for each stimulus.
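The off-scale treatment used for the Wilcoxon signed-rank test can be illustrated with a short sketch. The acquisition trials below are hypothetical, not the study's data, and the helper names are assumptions. The sketch computes only the signed-rank sums (not a P value) to show that the statistic depends on the ordering of the differences, so any off-scale value larger than every observed trial gives the same result.

```python
def signed_rank_sums(x, y):
    """Return (W_plus, W_minus) for paired samples, dropping zero differences
    and giving tied absolute differences their average rank."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for m in range(i, j):
            ranks[m] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus

def fill_off_scale(trials, off_scale):
    """Replace None (no acquisition detected) with an off-scale trial number."""
    return [off_scale if t is None else t for t in trials]

# Hypothetical trials of acquisition for eight subjects; None = did not acquire.
f_plus = [26, 40, 31, None, 52, 38, 45, None]
v_plus = [32, 35, 30, 60, None, 41, 44, None]

result_200 = signed_rank_sums(fill_off_scale(f_plus, 200), fill_off_scale(v_plus, 200))
result_big = signed_rank_sums(fill_off_scale(f_plus, 10000), fill_off_scale(v_plus, 10000))
print(result_200)  # (13.0, 15.0)
print(result_big)  # (13.0, 15.0)
```

A subject who failed to acquire on both schedules contributes a zero difference and is dropped, while a subject who failed on only one schedule always receives one of the highest ranks.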
In conclusion, the present data demonstrate, using within-subject comparisons, that appetitive Pavlovian conditioning is faster when longer ITIs precede the CS onset and that, when lifetime CS exposure is equated, there is no difference in conditioning to CSs of fixed or variable duration. Because subjects did not acquire a conditioned response to our control (C–) stimulus, the lack of a difference could not be attributed to simple stimulus generalization; the differences observed in Groups 4 and 5 also suggest that responding was under stimulus control.
  34 in total

Review 1.  Molecular mechanisms of memory acquisition, consolidation and retrieval.

Authors:  T Abel; K M Lattal
Journal:  Curr Opin Neurobiol       Date:  2001-04       Impact factor: 6.627

2.  Stimulus representation in SOP: I. Theoretical rationalization and some implications.

Authors:  Susan E. Brandon; Edgar H. Vogel; Allan R. Wagner
Journal:  Behav Processes       Date:  2003-04-28       Impact factor: 1.777

3.  Sometimes-competing retrieval (SOCR): a formalization of the comparator hypothesis.

Authors:  Steven C Stout; Ralph R Miller
Journal:  Psychol Rev       Date:  2007-07       Impact factor: 8.934

4.  Learning the temporal dynamics of behavior.

Authors:  A Machado
Journal:  Psychol Rev       Date:  1997-04       Impact factor: 8.934

5.  Appetitive Pavlovian goal-tracking memories reconsolidate only under specific conditions.

Authors:  Amy C Reichelt; Jonathan L C Lee
Journal:  Learn Mem       Date:  2012-12-21       Impact factor: 2.460

Review 6.  The neuroscience of learning: beyond the Hebbian synapse.

Authors:  C R Gallistel; Louis D Matzel
Journal:  Annu Rev Psychol       Date:  2012-07-12       Impact factor: 24.137

7.  Time and Associative Learning.

Authors:  Peter D Balsam; Michael R Drew; C R Gallistel
Journal:  Comp Cogn Behav Rev       Date:  2010

8.  Conditioned [corrected] stimulus informativeness governs conditioned stimulus-unconditioned stimulus associability.

Authors:  Ryan D Ward; C R Gallistel; Greg Jensen; Vanessa L Richards; Stephen Fairhurst; Peter D Balsam
Journal:  J Exp Psychol Anim Behav Process       Date:  2012-04-02

9.  Is the number of trials a primary determinant of conditioned responding?

Authors:  Daniel A Gottlieb
Journal:  J Exp Psychol Anim Behav Process       Date:  2008-04

10.  Temporal maps and informativeness in associative learning.

Authors:  Peter D Balsam; C Randy Gallistel
Journal:  Trends Neurosci       Date:  2009-01-10       Impact factor: 13.837
