Dynamic changes in single unit activity and γ oscillations in a thalamocortical circuit during rapid instrumental learning.

Chunxiu Yu, David Fan, Alberto Lopez, Henry H Yin.

Abstract

The medial prefrontal cortex (mPFC) and mediodorsal thalamus (MD) together form a thalamocortical circuit that has been implicated in the learning and production of goal-directed actions. In this study we measured neural activity in both regions simultaneously, as rats learned to press a lever to earn food rewards. In both MD and mPFC, instrumental learning was accompanied by dramatic changes in the firing patterns of the neurons, in particular the rapid emergence of single-unit neural activity reflecting the completion of the action and reward delivery. In addition, we observed distinct patterns of changes in the oscillatory LFP response in MD and mPFC. With learning, there was a significant increase in theta band oscillations (6-10 Hz) in the MD, but not in the mPFC. By contrast, gamma band oscillations (40-55 Hz) increased in the mPFC, but not in the MD. Coherence between these two regions also changed with learning: gamma coherence in relation to reward delivery increased, whereas theta coherence did not. Together these results suggest that, as rats learned the instrumental contingency between action and outcome, the emergence of task related neural activity is accompanied by enhanced functional interaction between MD and mPFC in response to the reward feedback.

Year: 2012 PMID: 23226318 PMCID: PMC3511528 DOI: 10.1371/journal.pone.0050578

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Extensive evidence implicates the prefrontal cortex (PFC) in the organization of goal-directed behavior [1], [2], [3]. But its functional interaction with other brain regions remains poorly understood. In particular, the mediodorsal thalamus (MD), a structure with extensive reciprocal connections with the mPFC, has been implicated in the learning of goal-directed actions [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14]. MD and PFC are critical components in both the associative and limbic thalamocortico-basal ganglia networks [15], [16], [17], [18], [19], [20]. Previous lesion studies have implicated both the MD and medial prefrontal cortex (mPFC) in the acquisition and performance of reward-guided actions. MD lesions impaired learning of stimulus-reward associations [21], [22], [23] and action-reward associations [4], [24], [25]. Lesions of the mPFC can produce similar effects [2], [4], [26], [27], [28], [29], [30]. However, despite their well-established anatomical connectivity, the functional interaction between MD and mPFC during goal-directed behavior remains poorly understood, because no previous study has recorded activity from both regions simultaneously during such behavior. Based on previous work [4], [28], we hypothesized that instrumental learning is accompanied by significant changes in the coordination of medial prefrontal and mediodorsal thalamic activity. We predicted that, as rats learn to perform reward-guided actions, activity in both regions will change to reflect the acquisition of the action-outcome instrumental contingency. To test this hypothesis, we chronically implanted miniaturized multi-electrode arrays (up to 64 channels) in rats and recorded single unit activity as well as local field potentials (LFPs) from both MD and mPFC as the rats learned to press a lever to earn food rewards. 
We measured the oscillatory activity in these brain regions simultaneously across successive days of instrumental learning. Our results show that, in the MD-mPFC circuit, dynamic changes in both single unit spiking activity and oscillatory LFP response in neuronal populations accompany the learning of a new action.

Materials and Methods

Ethics Statement

All procedures were approved by the Institutional Animal Care and Use Committee at Duke University and followed National Institutes of Health guidelines (Protocol Number: A087-08-03).

Animals and Surgery

Eight male Long-Evans rats (∼3 months of age at the beginning of the experiments) were used: in 5 rats we recorded single unit and LFP activity from MD and mPFC simultaneously, and in 3 rats we recorded from MD only. Surgery was performed under general anesthesia with isoflurane (2%). A craniotomy was performed over the bilateral thalamic and/or cortical locations according to known stereotaxic coordinates (in mm from bregma: MD, AP −2.1 to −3.3, ML −1 to 1; mPFC, AP 4.6 to 2.5, ML −1 to 1). The electrode arrays used in this study consisted of 4×8 or 2×8 platinum-coated tungsten microwire electrodes (35 µm diameter, Innovative Neurophysiology, NC), with 150 µm between microwires and 200 µm between rows. The arrays were lowered to the appropriate stereotaxic depth (MD ∼5.0 mm, mPFC ∼2.5 mm). Electrode placement was confirmed post-mortem after perfusion and fixation with 10% formalin, followed by Thionin staining in 100 µm coronal sections (Figure 1).

Figure 1

Electrode placement and behavioral results.

A, Coronal sections of the rat brain illustrating MD and mPFC electrode placements. The coordinates are based on a standard rat brain atlas [58]. The numbers indicate distance in mm from Bregma. MDC, mediodorsal thalamic nucleus, central part; MDM, mediodorsal thalamic nucleus, medial part; Cg1, cingulate cortex, area 1; PrL, prelimbic cortex. B, Outcome devaluation test. Devalued, rats received 1 h of unlimited access to the same food pellets earned by lever pressing. Non-devalued, rats did not receive any food for 1 h before the test. The normalized rate of pressing is the ratio of presses under each condition. Error bars indicate SEM.

In vivo Multi-electrode Recording during Instrumental Learning

Two weeks after surgery, rats were food deprived and maintained at ∼85% of free feeding weight throughout the experiments. Training took place in a Med Associates (St. Albans, VT) operant chamber designed for in vivo extracellular recording. The chamber was equipped with a food magazine that received 45 mg dustless precision pellets (Bio-Serv, NJ) from a pellet dispenser, two retractable levers on either side of the magazine, and a 3 W 24 V house light mounted on the wall opposite the levers and magazine. A computer with the Med-PC-IV program was used to control the equipment and record behavior. Time stamps for lever pressing behavior and reward delivery were sent as TTL pulses to the Blackrock Cerebus data acquisition system. Lever press training consisted of four daily sessions under a continuous reinforcement schedule (CRF, each press earns one food pellet). Each session started with illumination of the house light and insertion of the lever, and ended with turning off the house light and retraction of the lever after 120 minutes or 100 earned pellets (whichever came first). The amount of training used was based on previous work on instrumental conditioning, which showed that performance was goal-directed following limited training [31]. In a pilot experiment, we also verified the goal-directed control of the instrumental performance using an outcome devaluation procedure. Rats (n = 4) were given a 90-min pre-feeding session using the same pellets as the training sessions. They were then tested on a 2-min probe test conducted in extinction, i.e. without any reward delivery. Single-unit and LFP activity were recorded using the Cerebus data acquisition system (Blackrock Microsystems). For 4×8 electrode arrays, a TBSI (Triangle Biosystems) gain 2 headstage was used. For 2×8 arrays, the Blackrock gain 1 headstage was used, as recently described [32], [33]. 
In brief, the data were filtered with both analog and digital bandpass filters (analog high-pass first order Butterworth filter at 0.3 Hz, analog low-pass third order Butterworth filter at 7.5 kHz) and sampled at 30 kHz. Single unit data were separated with a high-pass digital filter (fourth order Butterworth filter at 250 Hz), while local field potential (LFP) signals were filtered with a third order high-pass filter and seventh order low-pass filter (0.1 Hz–5 Hz cutoffs). Spikes were sorted using Offline Sorter (Plexon) and single-unit activity was isolated on the basis of principal component analysis. Only single-unit activity with a clear separation from noise was used for the analysis. Matlab was used to remove 60 Hz line noise and large transient artifacts in the LFP data. The 60 Hz noise was removed using a blocked least mean squares (LMS) adaptive filter algorithm. The reference signal for the adaptive filter was created by finding the peak frequency of the LFP signal near the expected line noise frequency, and creating a sinusoidal reference signal with that frequency. The step size of the LMS algorithm was estimated by running the algorithm on a portion of the input signal for a range of step sizes, and using the step size that yielded the lowest RMS value of the error. Large transient motion artifacts were removed by subtracting a 20-sample moving window average around portions of the line-noise filtered signal with amplitude greater than 6 standard deviations from the mean.
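The line-noise removal step can be illustrated with a minimal sketch. It is not the code used in the study: it assumes a sample-by-sample LMS update rather than the blocked variant, a fixed 60 Hz reference (sine and cosine pair) rather than one derived from the spectral peak, and hypothetical function names and candidate step sizes.

```python
import math


def lms_line_noise_filter(signal, fs, line_freq=60.0, mu=0.01):
    """Adaptive noise canceller (assumed sample-by-sample LMS, not blocked).

    Two weights scale a sine/cosine reference at line_freq; the error
    (input minus the estimated line noise) is returned as the cleaned signal.
    """
    w_sin, w_cos = 0.0, 0.0
    cleaned = []
    for n, x in enumerate(signal):
        phase = 2.0 * math.pi * line_freq * n / fs
        ref_sin, ref_cos = math.sin(phase), math.cos(phase)
        estimate = w_sin * ref_sin + w_cos * ref_cos
        error = x - estimate
        # LMS weight update driven by the error signal.
        w_sin += 2.0 * mu * error * ref_sin
        w_cos += 2.0 * mu * error * ref_cos
        cleaned.append(error)
    return cleaned


def rms(values):
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def pick_step_size(segment, fs, candidates, line_freq=60.0):
    """Choose the candidate step size giving the lowest RMS error on a segment."""
    return min(
        candidates,
        key=lambda mu: rms(lms_line_noise_filter(segment, fs, line_freq, mu)),
    )
```

A typical call under these assumptions would be `mu = pick_step_size(lfp[:1000], fs, [0.001, 0.01, 0.05])` followed by `lms_line_noise_filter(lfp, fs, 60.0, mu)`. Note that the lowest-RMS criterion tends to favor larger step sizes, which also widen the notch around the line frequency.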

Data Analysis

Neuronal data analysis was performed with Neuroexplorer (Nex Technologies), Microsoft Excel, Graphpad Prism (GraphPad Software), and MATLAB (MathWorks). Neural activity was averaged in 50-ms bins, averaged across trials, and smoothed with a Gaussian filter to construct the peri-event histogram. To classify "action initiation" neurons, neural activity within 500 ms before the onset of lever pressing was compared to a baseline window from 1500 ms to 1000 ms before the lever press (two-tailed t test, p<0.01). To classify "reward delivery" neurons, neural activity within a 1000 ms window after reward delivery was compared with a baseline window from 2000 to 1000 ms before reward delivery. The time windows used were based on visual inspection of the data. Spectral analysis of LFP power and coherence was performed using Neuroexplorer. The power spectra were calculated using Welch's method (512 frequencies between 1 and 100 Hz, smoothed with a Gaussian kernel with bin width 3). Coherence is a measure of the linear correlation between two signals as a function of frequency [34], [35]. Coherence between two signals is calculated by normalizing the cross-spectral density by the auto-spectral densities of the two signals. The cross spectrum between two time series and the auto-spectrum of each signal are obtained by calculating the product of the Fast Fourier transformed series. The signals are subdivided into time intervals of length equal to the number of frequency samples divided by the maximum frequency, and the spectra are estimated by averaging the spectrum over these intervals (Welch's method). The coherence measure is sensitive to both a change in power and a change in phase relationships. Consequently, if either power or phase changes in one of the signals, the coherence value is affected. 
In our study, coherence analysis between LFPs from two regions was performed using 512 frequencies between 1 and 100 Hz with a 5% overlap window, smoothed with a Gaussian kernel with bin width = 3.
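As an illustration of the coherence measure described above, the following is a minimal sketch of a segment-averaged (Welch-style) magnitude-squared coherence evaluated at a single frequency. It does not reproduce the Neuroexplorer implementation: the Hann window, the direct single-frequency Fourier sum, the omission of Gaussian smoothing, and the function names are all assumptions made for the example.

```python
import cmath
import math


def segment_dft(segment, fs, freq):
    """Hann-windowed Fourier coefficient of one segment at a single frequency (Hz)."""
    n = len(segment)
    total = 0j
    for k, value in enumerate(segment):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1))
        total += window * value * cmath.exp(-2j * math.pi * freq * k / fs)
    return total


def coherence(x, y, fs, freq, seg_len, overlap=0.05):
    """Magnitude-squared coherence |Pxy|^2 / (Pxx * Pyy) averaged over segments.

    overlap is the fraction of each segment shared with the next one
    (0.05 mirrors the 5% overlap stated in the text).
    """
    step = max(1, int(seg_len * (1.0 - overlap)))
    pxx = 0.0
    pyy = 0.0
    pxy = 0j
    for start in range(0, len(x) - seg_len + 1, step):
        fx = segment_dft(x[start:start + seg_len], fs, freq)
        fy = segment_dft(y[start:start + seg_len], fs, freq)
        pxx += abs(fx) ** 2          # auto-spectrum of x
        pyy += abs(fy) ** 2          # auto-spectrum of y
        pxy += fx * fy.conjugate()   # cross-spectrum
    return abs(pxy) ** 2 / (pxx * pyy)
```

Because the cross-spectrum is summed as a complex number before taking its magnitude, the value approaches 1 only when the phase difference between the two signals is consistent across segments, and it falls toward 0 when the phases are unrelated. With a single segment the estimate is always 1, so several segments are needed for a meaningful value.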

Results

Behavior

All rats were naive when training began. Within the very first session of training, they learned to press the lever for reward, and their performance improved over 4 days. Previous work has established that with such limited training, instrumental behavior is highly goal-directed and sensitive to devaluation of the outcome [31], [36]. In a separate experiment, we assessed the effect of outcome devaluation on lever pressing with limited training. After the same amount of CRF training, rats were given 1 hour of exposure to an unlimited amount of food pellets just before a 2-min probe session conducted in extinction. Outcome devaluation by pre-feeding significantly reduced instrumental performance (n = 4, paired t test, p = 0.01; Figure 1B), suggesting that with the amount of training used in this study the performance is controlled by the action-outcome instrumental contingency.

Electrode Placement

In 5 rats, MD and mPFC were recorded simultaneously, with each array covering both sides of the brain. Three rats were implanted in the MD only. Histological analysis showed clear electrode tracks and recording sites in MD and mPFC (mainly prelimbic and infralimbic regions), but not in the anterior cingulate cortex (Figure 1A). We recorded from a total of 268 neurons from MD (n = 69, 71, 66, 62 for each recording session) and 170 neurons from mPFC (n = 44, 45, 44, 37 per session). Based on the waveform differences over days from the same electrode, new neurons were considered to be recorded each day.

Changes in Single Unit Activity during Acquisition

Single unit neural activity was recorded starting with the 1st session of CRF training. All rats learned to press a lever for food pellets within 4 sessions of training. In the beginning very few neurons were task related. With training, however, the neural activity in both MD and mPFC changed dramatically. The most common type of task related modulation was found in response to reward delivery. Figure 2A shows the dynamic changes of the firing rate of all recorded neurons upon the reward delivery across four consecutive sessions. Interestingly, the firing rates of mPFC neurons increased with learning (one-way ANOVA, Kruskal-Wallis test, p = 0.02), but not those of MD neurons (Kruskal-Wallis test, p = 0.71).

Figure 2

Neural plasticity in MD and mPFC during learning.

A, The firing rates of all responsive neurons in the MD (n = 69, 71, 66, 62 for each recording session) and mPFC (n = 44, 45, 44, 37) upon the reward delivery during acquisition. B, Dramatic increase in overall percentage of neurons whose activity is modulated by the reward delivery (both excited and inhibited). The rate of lever pressing across 4 training sessions is also plotted. MD and mPFC activity was recorded simultaneously from 5 rats. MD activity only was recorded from 3 rats. Error bars represent SEM.

In both regions, many units responded after the termination of lever press and the delivery of the reward (Figure 2B). The reported increase was observed even when only the first 30 presses from each session were analyzed. Thus, the increased number of lever presses in the later sessions was not responsible for producing this effect. Representative waveforms of the single units are shown in Figure 3A. We found 42 MD neurons and 28 mPFC neurons that were significantly excited by the reward delivery. On the other hand, there were 32 MD neurons and 17 mPFC neurons that reduced firing after reward delivery (Figure 3). Some neurons exhibited clear increased responses to reward delivery even within a single session after learning (Figure 4).

Figure 3

Neuronal activity in MD and mPFC during acquisition.

A, Left, action potential waveform and distribution of interspike intervals of representative neurons recorded from MD and mPFC. Right, Perievent raster plots of representative neurons. Each row in raster plot represents a single trial. Green line represents time of reward delivery. Reward excited neuron increases firing after the completion of lever press action and delivery of reward. Reward inhibited neuron decreases firing upon the delivery of reward. B, Top, spike density functions of individual neurons that transiently increased (MD n = 42; mPFC n = 28) or decreased (MD n = 32; mPFC n = 17) activity following reward delivery. Each row shows a z-score normalized spike density function for a single neuron. The neurons are sorted by the latency to the maximum or minimum amplitude. Bottom, normalized population firing rate of reward excited and inhibited neurons at the time of reward delivery. Shaded areas indicate SEM.

Figure 4

Changes of neuronal responses within a single session.

Perievent histogram of representative MD (top) and mPFC (bottom) reward related neurons during the first 30 presses (green) and last 30 presses (red) during the second or third acquisition session.

We also analyzed single unit activity just before the lever press. We found that the activity of fewer neurons was modulated by action preparation and initiation. We found 15 "excited" neurons in the MD and 3 in the mPFC; and 17 "inhibited" neurons in the MD and 10 in the mPFC.
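The window comparison used to identify such neurons (see Data Analysis) can be sketched as follows. This is an illustrative sketch only: it assumes a paired comparison of per-trial spike counts, replaces the t distribution with a normal approximation for the p-value (reasonable only for large trial counts), omits the Gaussian smoothing of the histogram, and uses hypothetical function names.

```python
import math
from statistics import NormalDist, mean, stdev


def window_counts(spike_times, event_times, start, stop):
    """Spike count per trial in the window [event + start, event + stop), in seconds."""
    return [
        sum(1 for s in spike_times if t + start <= s < t + stop)
        for t in event_times
    ]


def peri_event_histogram(spike_times, event_times, t_min, t_max, bin_width=0.05):
    """Trial-averaged firing rate (Hz) in bins of bin_width seconds around each event."""
    n_bins = int(round((t_max - t_min) / bin_width))
    counts = [0] * n_bins
    for t in event_times:
        for s in spike_times:
            idx = math.floor((s - t - t_min) / bin_width)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    return [c / (len(event_times) * bin_width) for c in counts]


def paired_window_test(test_counts, base_counts):
    """Paired statistic on per-trial counts; p-value uses a normal approximation."""
    diffs = [a - b for a, b in zip(test_counts, base_counts)]
    spread = stdev(diffs)
    if spread == 0.0:
        if mean(diffs) == 0:
            return 0.0, 1.0
        return math.copysign(math.inf, mean(diffs)), 0.0
    t_stat = mean(diffs) / (spread / math.sqrt(len(diffs)))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value
```

For the action initiation criterion, the test window would be `window_counts(spikes, presses, -0.5, 0.0)` and the baseline `window_counts(spikes, presses, -1.5, -1.0)`; because both windows last 500 ms, the counts can be compared directly, and a neuron would be flagged when the p-value falls below 0.01.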

Changes in LFP during Learning

We also examined changes in LFP during learning. We recorded from 14 mPFC channels from 5 rats in which MD and mPFC were simultaneously recorded, and from 22 MD channels from 5 MD-mPFC and 2 MD rats (1 rat was excluded because of excessive noise in the LFP recording). Representative peri-event histograms are displayed in Figure 5A. Upon reward delivery, a prominent dip was observed in the LFP, indicating a net depolarization in the subthreshold activity of the neuronal population. As shown in Figure 5B, this depolarization increased in the course of learning. The effect was observed when we only analyzed the same number of presses from the first session and the last session, to rule out any differences due to the increase in the number of presses during learning.

Figure 5

Learning-related modulation in LFP activities.

A, Perievent raster plots of representative LFP recording. Both examples display depolarization following reward delivery. B, Perievent histograms of representative LFP recorded from 3 rats during the first training session and the last session (top, MD; bottom, mPFC). LFPs exhibit depolarization (negativity in the extracellular recording) with learning.

Dynamic Changes in Neural Oscillations Associated with Learning

In the MD, LFP showed strong theta oscillations (∼7–8 Hz) and weak gamma oscillations (∼50 Hz), whereas mPFC LFP showed the opposite pattern (Figure 6). More importantly, as shown in Figure 7, the overall oscillatory activity in both MD and mPFC changed dramatically during learning. In the MD, theta power increased during learning (Figure 7B, one-way ANOVA, F = 5.75, p = 0.002), but gamma power did not change significantly (F = 0.53, p = 0.66). In the mPFC, on the other hand, gamma oscillations became very pronounced after learning (Figure 7D, one-way ANOVA, F = 4.60, p = 0.008), but no significant changes were seen in the theta power (F = 1.15, p = 0.34).

Figure 6

LFP recording in MD and mPFC during behavior.

Representative LFP traces recorded from the four electrodes during the final session. LFPs in the MD exhibit prominent theta band (∼7–8 Hz) oscillations, whereas LFPs in the mPFC show prominent gamma band (∼50 Hz) oscillations.

Figure 7

Dynamic changes in oscillatory activity during learning.

A, Power spectral analysis of theta and gamma oscillations in the MD. Theta band oscillations increased during training, but gamma oscillations did not. First, first session; Last, last (4th) training session. Representative data are shown from one rat with simultaneous MD and mPFC recordings. B, Normalized (% of the first session) power of theta and gamma oscillations in the MD (n = 22) during acquisition. Theta oscillations in the MD increased significantly over time, whereas gamma oscillations did not. Data from all animals are averaged and shown here. Error bars represent SEM. C, Power spectral analysis of theta and gamma oscillations in the mPFC. Representative data are shown from one rat with simultaneous MD and mPFC recordings. D, Normalized (% of the first session) power of theta and gamma oscillations in the mPFC (n = 14) during acquisition. There was a significant increase in gamma oscillations but not in theta oscillations.

In accord with our single unit recording data, we did not find significant modulation of the LFP during the action initiation period (just before the lever press). But gamma power in both MD and mPFC peaked upon the reward delivery. mPFC showed higher gamma power compared to MD. Two representative peri-event spectrograms are shown in Figure 8. LFP oscillations upon reward delivery (during the time window from the reward delivery to the start of the head entry into the food cup) in both MD and mPFC changed differentially across training sessions. In the MD, neither theta nor gamma power changed significantly during acquisition (repeated measures ANOVA, Fs <2.33, ps>0.05). By contrast, in the mPFC, gamma oscillations became more pronounced with training (repeated measures ANOVA, F = 3.21, p = 0.03). Interestingly, theta oscillations upon reward delivery were reduced (repeated measures ANOVA, F = 3.23, p = 0.03). These findings suggested that theta and gamma oscillations were differentially modulated throughout the training sessions.

Figure 8

Theta and gamma frequency oscillations.

(A and C) Perievent spectrograms of representative MD (A) and mPFC (C) LFP during the first (top) and last (bottom) session. MD theta power is much stronger compared to mPFC. mPFC gamma power is much stronger compared to MD. After learning, gamma power is maximal at the time of reward delivery. (B and D) Changes of normalized power spectra of theta and gamma frequency oscillations in MD (n = 22) (B) and mPFC (n = 14) (D) upon the reward delivery across four sessions. Error bars indicate SEM.

Changes in Coherence between MD and mPFC Activity during Learning

To determine the dynamic interactions between MD and mPFC during learning, we analyzed the coherence between these areas across four sessions. Coherence can be used as an estimate of the strength of coupling between activities from two different brain regions. As shown in Figure 9, the overall coherence between MD and mPFC changed significantly during the course of learning. Theta coherence did not change significantly across sessions (repeated measures ANOVA, F = 2.40, p = 0.07). By contrast, gamma coherence was weak at first, but increased significantly with learning (repeated measures ANOVA, F = 3.39, p = 0.02). Next, we examined how the coherence between MD and mPFC was modulated by reward delivery across sessions. Gamma coherence upon reward delivery increased during learning (repeated measures ANOVA, F = 7.75, p = 0.0001), but theta coherence did not (F = 1.43, p = 0.24).

Figure 9

Changes in coherence between MD and mPFC during learning.

A, Left, Overall coherence from two representative electrodes during the first (green) and last (red) session in MD and mPFC. Right, dynamic changes of theta and gamma coherence during 4 sessions of instrumental learning (n = 33 pairs in each session). Error bars represent SEM. B, Left, coherence upon reward delivery measured from activity from two representative electrodes. Right, dynamic modulation of theta and gamma coherence by the reward delivery. Error bars indicate SEM.

Discussion

To understand the role of MD and mPFC in the acquisition of goal-directed behavior, we recorded from both areas as rats learned to press a lever for food rewards. All rats learned to press the lever by the end of the first session, and progressively increased their rate of lever pressing (Figure 2). They rapidly learned the relationship between the lever press and reward. Neural activity in this thalamocortical circuit changed dramatically during instrumental learning. Our results suggest that MD and mPFC form a functional circuit, with similar task-related activity which emerges in the course of learning. However, we also found significant differences in the pattern of oscillatory activity in these two regions, and above all in the dynamic changes of such activity during training. Such oscillatory activity was modulated by reward delivery. The coherence between MD and mPFC activity also changed significantly during the course of learning (Table 1).

Table 1

Summary of the changes in neural activity and LFP oscillation as result of learning.

Changes	Measure	MD	mPFC
Neuronal activity	firing rate	no change	increased
	within session activity	increased	increased
	# of excited neurons	42/268	28/170
	# of inhibited neurons	32/268	17/170
	percent neurons across training sessions	increased	increased
LFP power	overall theta	increased	no change
	overall gamma	no change	increased
	theta to reward	no change	no change
	gamma to reward	decreased	increased
Coherence	theta (overall & reward)	no change
	gamma (overall & reward)	increased

In our study, we recorded from completely naive rats learning to press the lever for the first time. We were thus able to collect data on how neural activity changed during the initial phase of instrumental learning, when the animal rapidly acquired the relationship between the lever press and reward delivery. It is important to note that performance of the action after initial acquisition is highly sensitive to changes in outcome value, as shown by our devaluation test. The lever pressing was therefore clearly goal-directed. The observed plasticity accompanies the acquisition of the action-outcome contingency. At the start of training, there were virtually no task-related neurons in either MD or mPFC. However, as the rats learned to press the lever, many neurons in both regions increased or decreased their rate of firing in relation to lever pressing and reward (Figure 2). The LFP data (Figure 5), which show significant depolarization in the subthreshold activity in response to reward delivery, also suggest that the emergence of reward-elicited activity is a widespread phenomenon. To our knowledge, this is the first report of significant plasticity in vivo in this thalamocortical circuit during instrumental learning. For the continuous reinforcement task used in this study, reward is delivered immediately upon the completion of the lever press. Surprisingly, although the firing rate of some neurons was modulated during the action initiation period (starting at 500 ms before the lever press), such neurons were rare in both MD and mPFC. Nor did we observe significant population activity (LFP) that was modulated by action initiation. In contrast, neurons that altered their firing activity following the completion of the action and the reward delivery were much more common, as confirmed by the LFP recordings (Figures 3 and 5). 
These results suggest that the primary role of the MD-mPFC circuit is to signal the outcome of the goal-directed behavior, in this case the reward feedback. This is in accord with previous work showing that learning of stimulus-reward associations also requires the MD [21], [22], [23].

Changes in Oscillatory Activity in Local Field Potential Recording

Oscillatory activities in different frequency ranges are widely found in different brain areas and correlated with behavioral states [37], [38], [39], [40], [41]. Previous work has shown significant changes in oscillations during learning [41], [42], [43], [44]. Despite the similarities between MD and mPFC in their overall pattern of task-related activity, we observed striking differences between these areas in the dynamic changes in oscillatory LFP activity. Above all, gamma power increased in mPFC, but not in MD (Figure 8). Theta (6–10 Hz) oscillations are common in the prefrontal cortex and hippocampus, often found during exploration and learning [42], [45], [46]. Gamma (40–55 Hz) oscillations, on the other hand, have traditionally been linked to attention [44]. More recently, recording from the rat ventral striatum, Redish and colleagues found that gamma oscillations increased following the delivery of rewards and gamma power increased with learning on a maze task [40], [47]. Both our data and previous work from Redish and colleagues show a reward-elicited increase in oscillations in the low gamma range of roughly 50 Hz. Since the mPFC sends excitatory projections directly to the ventral striatum, the reward-elicited increase in gamma power observed in the ventral striatum could in part be caused by this prominent corticostriatal projection. Theta oscillations have also been linked to working memory performance in rodents [48], monkeys [49] and humans [50]. Gamma oscillations in the PFC are hypothesized to play an important role in attention by enhancing the neuronal representation of attended sensory input and by regulating the communication among neuronal groups in distinct areas that convey the behaviorally relevant information [46]. The coherence measure could reflect the functional interactions between different brain regions [51], [52]. 
When we measured the overall coherence between simultaneously recorded MD and mPFC LFP during the course of training, we found that theta coherence did not change, whereas gamma coherence increased with instrumental learning. When we examined the coherence in response to the reward delivery, we also found a significant increase in gamma coherence, but theta coherence did not change significantly across sessions (Figure 9). The enhanced gamma coherence could reflect excitatory inputs responsible for the increase in firing rate of single units immediately after reward delivery (Figure 3). Thus, an overall increase in gamma coherence between MD and mPFC in response to reward delivery is the most striking change in the LFP during initial acquisition. Such changes can have a major impact on effective communication between these two structures. Whether the increase in gamma coherence we observed reflects increased perceptual attention to essential environmental feedback for goal-directed actions, or plays a more critical role in the generation of the appropriate action, remains to be determined by future studies that manipulate online neural activity directly. In short, our data revealed that instrumental learning in a standard operant task is accompanied by dramatic changes in coordination of population activity between MD and mPFC. Few neurons in MD and mPFC changed their activity prior to the initiation of action, suggesting that this thalamocortical circuit is not critical for action initiation and selection, in agreement with the effects of lesions to these two areas [4], [28]. On the other hand, basal ganglia lesions are well known to impair action initiation [53]. Given the strong projections from the mPFC to the ventral and medial striatal regions, signals representing behavioral outcomes (such as reward) could be transmitted to the basal ganglia, which plays an important role in the learning and expression of goal-directed actions [54]. 
The role of the MD-mPFC circuit therefore appears to be restricted to the signaling of the reward feedback following the action [55]. Our findings are also in agreement with previous lesion studies implicating MD and mPFC in the learning of the action-outcome contingency [4], [56], [57]. It is important to point out that this thalamocortical circuit alone is not sufficient for instrumental learning; a distributed circuit involving additional brain regions in the basal ganglia is needed [28], [54]. The present study therefore merely represents an initial step in elucidating the computational roles of the brain regions that are essential for the acquisition and expression of goal-directed behaviors.

