The dorsal cochlear nucleus (DCN) integrates auditory nerve input with a diverse array of sensory and motor signals processed in circuitry similar to that of the cerebellum. Yet how the DCN contributes to early auditory processing has been a longstanding puzzle. Using electrophysiological recordings in mice during licking behavior, we show that DCN neurons are largely unaffected by self-generated sounds while remaining sensitive to external acoustic stimuli. Recordings in deafened mice, together with neural activity manipulations, indicate that self-generated sounds are cancelled by non-auditory signals conveyed by mossy fibers. In addition, DCN neurons exhibit gradual reductions in their responses to acoustic stimuli that are temporally correlated with licking. Together, these findings suggest that DCN may act as an adaptive filter for cancelling self-generated sounds. Adaptive filtering has been established previously for cerebellum-like sensory structures in fish, suggesting a conserved function for such structures across vertebrates.
The first central stage of mammalian auditory processing occurs within the dorsal
and ventral divisions of the cochlear nucleus[1]. Based on similarities in their evolution, development, gene
expression patterns, and anatomical arrangement, the DCN is considered to belong to a
class of so-called cerebellum-like sensory structures[2-6]. Other
cerebellum-like structures include the first central stages of electrosensory and
mechanosensory lateral line processing in several groups of fish. Numerous cell and
fiber types are shared by all of these cerebellum-like structures and the cerebellum
itself including: mossy fibers, granule cells, parallel fibers, Golgi cells, molecular
layer interneurons, and Purkinje or Purkinje-like cells. A hallmark of the circuitry of
cerebellum-like sensory structures is the integration of direct input from peripheral
sensory receptors (e.g. electroreceptors in the case of cerebellum-like structures in
fish and auditory nerve fibers in the case of DCN) with a diverse array of sensory and
motor signals conveyed by a granule cell-parallel fiber system.

A primary site of this integration within DCN is the fusiform cell. Fusiform
cells are also the major output cell of DCN and project to higher stages of auditory
processing such as the inferior colliculus. The basilar dendrites of fusiform cells are
contacted by auditory nerve fibers, which form a tonotopic map within the deep layer of
DCN (Supplementary Fig.
1)[1, 6]. Their apical dendrites extend into a superficial
molecular layer where they are contacted by parallel fibers. Parallel fibers arise from
granule cells located in so-called granule cell domains (GCDs) around the margins of the
nucleus and cross through different tonotopic regions of DCN[4]. Granule cells receive a wide variety of signals,
both auditory and non-auditory, from mossy fibers originating in a number of different
brain regions[6]. Parallel fiber synapses, but not
auditory nerve fiber synapses, have been shown to exhibit forms of long-term associative
synaptic plasticity in vitro[7-9]. Though previous
in vivo studies of DCN have extensively characterized auditory
response properties in anesthetized or decerebrate animals[10], much less is known about the functional
significance of its cerebellum-like circuitry[11-13].

Some of the best clues come from studies of cerebellum-like structures associated
with electrosensory processing in fish. Such studies have shown that anti-Hebbian
synaptic plasticity acting on proprioceptive, electrosensory, and motor corollary
discharge signals conveyed by parallel fibers serves to cancel principal cell responses
to self-generated electrosensory inputs, e.g. those arising from the fish’s own
movements or electromotor behavior[14, 15]. Cancellation of self-generated
electrosensory inputs allows externally-generated, behaviorally relevant stimuli to be
processed more effectively. Guided by these results, we set out to test the hypothesis
that the cerebellum-like circuitry of the DCN functions to cancel responses to
self-generated sounds.

To this end we developed a preparation to study neural responses to
self-generated sounds in the auditory brainstem of awake, behaving mice. We chose
licking behavior because it is stereotyped and repetitive, can be elicited in head-fixed
animals during electrophysiological recordings, and, as we demonstrate, generates sounds
which are a potential source of interference for the mouse auditory system.
Results
DCN neurons respond preferentially to external versus self-generated
sounds
We found that rhythmic licking generates sounds within the hearing range
of the mouse and that such sounds exhibit stereotyped spectral and temporal
profiles that were similar across mice (Fig.
1a, Supplementary
Fig. 2 and Supplementary Video 1). The temporal profile of the licking sound is
shown by the root mean squared (RMS) amplitude of the microphone recording
aligned to tongue contact with the lick spout (Fig. 1a, white trace). Though the exact physical
origin of the licking sounds was not determined, tongue-to-spout contact appears
not to be the main cause. As can be seen in both the spectrogram and RMS
amplitude trace from the representative mouse shown in Figure 1a, licking sounds typically consist of an
early component that begins before contact as well as a larger late component
that peaks ~50 ms after contact, during tongue retraction (Fig. 1a and Supplementary Fig. 2).
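The alignment of the microphone RMS amplitude to tongue contact described above can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names `sliding_rms` and `event_triggered_average`, the window lengths, and the toy signal are all hypothetical, and the sketch only assumes that contact times are available as sample indices.

```python
import math

def sliding_rms(signal, window):
    """RMS amplitude of `signal` in a centered sliding window of `window` samples."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        seg = signal[max(0, i - half): i + half + 1]
        out.append(math.sqrt(sum(x * x for x in seg) / len(seg)))
    return out

def event_triggered_average(trace, event_idx, pre, post):
    """Average `trace` from `pre` samples before to `post` samples after each event."""
    windows = [trace[i - pre: i + post + 1]
               for i in event_idx
               if i - pre >= 0 and i + post < len(trace)]
    if not windows:
        raise ValueError("no event has a complete window")
    n = len(windows)
    return [sum(w[k] for w in windows) / n for k in range(pre + post + 1)]

# Toy data (hypothetical): silence plus a brief burst shortly after each "contact".
contacts = [20, 60, 100]
mic = [0.0] * 140
for c in contacts:
    for k in range(4, 7):
        mic[c + k] = 1.0 if k % 2 == 0 else -1.0

rms = sliding_rms(mic, window=3)
avg = event_triggered_average(rms, contacts, pre=10, post=15)

# Lag (in samples) of the peak of the contact-aligned RMS trace.
peak_lag = max(range(len(avg)), key=avg.__getitem__) - 10
print(peak_lag)
```

Converting the peak lag from samples to milliseconds requires the recording sample rate, which is not stated in this section.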
Figure 1
Self-generated sounds strongly affect VCN but not DCN neurons
(a) Average spectrogram of self-generated sounds during licking for
a representative mouse. Arrow and dotted line indicate time of tongue contact
with the lick spout. Solid white line indicates the root mean squared (RMS)
amplitude of the microphone recording. (b) Left,
dextran-conjugated Alexa 594 labeling (green) at recording
sites in DCN and VCN (arrowheads). DAPI, red.
Right, higher magnification of dashed white box on left showing a
labeled fusiform cell (arrowhead). (c) Example
ventral cochlear nucleus (VCN) unit response during licking.
Arrows and dotted lines indicate times of
tongue contact with the lick spout. Traces represent the microphone recording
(top), smoothed firing rate (middle), and the VCN
unit recording (bottom; scale: 30 μV). (d)
Top, average RMS amplitude of the licking sound during VCN unit
recordings (scale bar: 1 a.u.). Bottom, average VCN lick-triggered
firing rate (n = 21). Thin lines are s.e.m.
(e) Example DCN unit response during licking. Scale bar and
display same as in c. (f) Top, average
RMS amplitude of the licking sound during DCN unit recordings.
Bottom, average lick-triggered responses of all DCN units
(n = 25), excluding those exhibiting
complex-spikes. Compared to VCN units, DCN units exhibited smaller temporal
modulations related to licking (peak-to-trough firing rate for VCN: 43.8
± 26.9 Hz, n = 21; for DCN: 19.7 ± 19.9
Hz, n = 25, mean and S.D., P =
0.0005, Wilcoxon Rank Sum Test). Scale bar and display same as in
d. (g) Z-scored lick responses (see
Methods) were significantly smaller in DCN compared to VCN
units (P = 0.00002, Wilcoxon Rank Sum Test). Median
responses are indicated by solid lines.
To determine whether licking sounds evoke neural responses that could
interfere with auditory processing and, if so, whether such responses are
cancelled out in the DCN, we compared neural activity during licking in
well-isolated single-units in the ventral cochlear nucleus (VCN) and DCN. Since
VCN receives direct auditory nerve input but lacks cerebellum-like circuitry, we
hypothesized that VCN units would respond to acoustic stimuli regardless of
whether they are self- or externally-generated. Recording locations were judged
based on characteristic reversals of tonotopy at the DCN/VCN border[16, 17] and verified by iontophoresis of a dextran-conjugated
fluorescent dye (Fig. 1b, white
arrowheads indicate recording sites, Supplementary Fig. 3 and
Methods). Though unambiguous criteria for linking physiological
response properties with morphological cell classes have not yet been
established for the awake mouse DCN[18], several properties of the recorded units indicate that
they likely correspond to fusiform cells, including their high spontaneous
firing rates and purely excitatory responses to acoustic stimuli (Supplementary Fig. 4 and
Methods)[19-23].
Units exhibiting complex spikes, putative cartwheel cell interneurons[24, 25] (Supplementary Fig. 1), were also encountered and analyzed
separately. Results for complex-spiking units are reported in Figure 4.
Figure 4
Non-auditory responses related to licking in DCN complex-spiking
units
(a) Sound-evoked field potentials (50 ms, broadband noise, averaged
over 50 presentations) recorded in DCN of the same mouse before
(left) and after (right) surgical
deafening. Note complete absence of sound-evoked field potentials after
deafening. (b) Example DCN complex-spiking unit recorded during
licking in a surgically deafened mouse. Arrows and dotted lines indicate times
of tongue contact with the lick spout. Top trace, microphone
recording. Below, extracellular voltage from a DCN complex-spiking
unit (scale: 30 μV). i, ii, Expanded traces from boxed
regions showing complex spike (CS) (shaded rectangle) and simple spike (SS)
waveforms. (c,d) Lick-triggered SS and CS firing rates for two
complex-spiking units recorded in deafened mice. Thin lines are s.e.m. Gray
traces show the average lick-triggered response of shuffled spike trains. Data
in c are from same unit as example traces in b. Top
trace (black) is the RMS amplitude of the licking sound (scale bar
= 1 a.u.). (e) Summary of z-scored lick responses of 11
complex-spiking units recorded in 3 surgically deafened mice. 8 showed
significant lick responses in their SS firing and 3 showed significant lick
responses in their CS firing (α=0.01, see Methods).
Median responses are indicated by solid lines. (f) Overall CS
firing rates increased slightly during periods of licking in deafened mice
(n = 11, P = 0.04, Wilcoxon
Signed Rank Test). (g) Lick-triggered SS and CS firing rates for a
complex-spiking unit recorded in a hearing mouse. Same display as in
c. (h) Mimic-triggered SS and CS firing rates for
a complex-spiking unit recorded in a hearing mouse. Same unit as shown in
g. (i) Summary of licking and mimic responses in
complex-spiking units recorded in hearing mice. Average Z-score responses to
licking (n = 23) were 8.2 ± 4.4 for SSs and 3.8
± 4.4 for CSs. Average Z-score responses to the mimic
(n = 12) were 10.1 ± 4.9 for SSs and 3.2
± 3.3 for CSs.
Consistent with the possibility that licking behavior causes significant
self-generated sounds, VCN units exhibited an overall firing rate elevation
during licking as well as firing rate modulations (Fig. 1c,d, blue traces) that tracked the RMS
amplitude of the licking sound (black traces). In contrast, DCN
units exhibited substantially weaker firing rate modulations during licking
(Fig. 1e–g, red
traces and circles). Though these results are
consistent with cancellation of self-generated sounds in DCN, an alternative
explanation is that differences between VCN and DCN responses during licking are
due to systematic differences in their auditory response properties.

We evaluated this possibility in a subset of VCN and DCN units by
comparing activity during licking to activity during delivery of an
externally-generated acoustic stimulus with temporal and spectral properties
that roughly matched the licking sounds recorded across mice (Fig. 2a, Methods). This stimulus is
referred to henceforth as the lick mimic and was presented outside of licking
bouts, when the mouse was still. Though the match between actual sounds
generated by licking and the lick mimic is not expected to be perfect, for
example due to issues such as bone conduction, this stimulus nevertheless
provided a simple and principled means of comparing auditory responses in VCN
and DCN. Strong responses to the lick mimic were observed in both VCN (Fig. 2b,c, blue traces) and
DCN units (Fig. 2d,e, red
traces). The strength of responses during licking was highly
correlated with the strength of responses to the lick mimic in VCN units (Fig. 2f, blue circles). This
is exactly what is expected if VCN licking responses are indeed due to
self-generated sounds. In contrast, there was no significant correlation between
mimic and licking responses in DCN units (Fig.
2f, red circles), such that even units with strong
responses to the lick mimic failed to respond during licking. These observations
suggest that weaker responses to licking in DCN compared to VCN cannot be
explained by differences in auditory sensitivity between the two regions. What
then is the mechanism underlying the apparent reduction of responses to
self-generated sounds during licking behavior in DCN?
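The logic of the comparison above is that if licking responses simply reflect self-generated sound, a unit's response during licking should scale with its response to the mimic, whereas cancellation should decouple the two. The following sketch illustrates that reasoning with a hand-written Pearson correlation. The response values are invented for illustration and are not data from the study; the sketch also omits the regression t-test reported in the Figure 2 legend.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-unit responses (arbitrary units), one entry per unit.
mimic = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

# "Pass-through" units: the licking response scales with the mimic response.
lick_passthrough = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]

# "Cancelling" units: the licking response stays near zero regardless of
# how strongly the unit responds to the mimic.
lick_cancelled = [0.3, -0.2, 0.1, 0.1, -0.2, 0.3]

r_pass = pearson_r(mimic, lick_passthrough)
r_cancel = pearson_r(mimic, lick_cancelled)
print(round(r_pass, 3), round(r_cancel, 3))
```

In this toy example the pass-through population yields a correlation close to 1 and the cancelling population a correlation close to 0, which is the qualitative pattern reported for VCN and DCN units, respectively.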
Figure 2
Responses to self-generated versus external sounds in VCN and DCN
(a) Spectrogram of the lick mimic generated from microphone
recordings from 5 mice (Methods). Overlaid white line represents
the RMS amplitude. (b) Example VCN unit response to the lick mimic.
Same unit as in Fig. 1c. Traces represent a
schematic of the RMS of the mimic (top), smoothed firing rate
(middle), and the VCN unit recording (bottom;
scale bar: 30 μV). (c), Top, schematic of the
RMS of the lick mimic. Bottom, average VCN unit response to the
lick mimic (n = 6). Thin lines are s.e.m. The lick
mimic was delivered at 12 dB SPL in all experiments. (d, e) Same
scale bar and display as b, c but for DCN unit
responses to the mimic (n = 13). Traces in
d are from same unit shown in Fig. 1e. VCN and DCN unit responses to the mimic were not
significantly different (P = 0.32, Wilcoxon Rank Sum
Test). (f) Responses to licking were highly correlated to those
observed in the same units to the lick mimic for VCN (n
= 6, P < 0.001, r = 0.95,
linear regression t-test) but not DCN recordings (n =
13, P = 0.79, r = 0.0007,
linear regression t-test).
One possibility is that the overall sensitivity of DCN units to sound is
reduced during licking behavior. Indeed, an overall suppression of auditory
responsiveness during behavior has been reported in a variety of
systems[26, 27], including the mouse auditory
cortex[28, 29]. To test this, we compared DCN unit
responses to an externally-generated acoustic stimulus (bandpassed noise,
5–15 kHz, 15 dB SPL) delivered either during licking (Fig. 3a, lick and noise) or when the
mouse was still (Fig. 3a, noise
alone). Responses to the acoustic stimulus were indistinguishable
under the two conditions (Fig. 3a,b). In
addition, overall firing rates in DCN units were similar when mice were licking
versus still (Fig. 3c). Together, these
results are inconsistent with an overall suppression of auditory sensitivity in
DCN during licking and point instead to a mechanism for selectively canceling
self-generated sounds.
Figure 3
DCN responses to acoustic stimuli are not suppressed during licking
(a) Example DCN unit response to an acoustic stimulus (bandpass-filtered
noise, 5–15 kHz, 15 dB SPL) played while the mouse was still versus
during licking. Gray bar indicates stimulus duration. (b) No
differences in responses were observed when the mouse was still versus licking
(n = 9, P = 0.49, Wilcoxon
Signed Rank Test). (c) Overall firing rates for DCN units were
similar when the mouse was licking versus still.
Non-auditory signals related to licking revealed in DCN of deafened
mice
In addition to auditory nerve input, DCN receives non-auditory,
behavior-related signals conveyed by mossy fibers. Previous studies of
cerebellum-like structures in fish have shown that behavior-related signals
conveyed by mossy fibers serve to selectively cancel self-generated
electrosensory input[14, 15]. Though electrophysiological
correlates of non-auditory mossy fiber inputs to DCN have been characterized in
anesthetized or decerebrate preparations, e.g. using electrical stimulation of
somatosensory brain regions projecting to DCN[12, 13,
30], responses to
non-auditory inputs have not yet been demonstrated in awake, behaving animals.
To isolate non-auditory responses related to licking behavior we recorded from
DCN in deafened mice (n = 3). Deafening (see
Methods) was confirmed by a lack of observable behavioral
responses to acoustic stimuli and by recording auditory-evoked field potentials
in DCN before and after deafening (Fig.
4a). For recordings in deafened mice we focused exclusively on units that
exhibited both isolated action potentials, known as simple spikes, and brief,
high-frequency bursts of action potentials, known as complex spikes (Fig. 4b, green boxes). Such
complex-spiking units correspond to a class of DCN interneuron known as
cartwheel cells (CWCs) that share numerous similarities with Purkinje cells in
the cerebellum (Supplementary
Fig. 1)[24, 25]. CWCs lack auditory nerve
input, receive massive input from parallel fibers, and inhibit fusiform cells.
Our reasons for focusing on CWCs were twofold: (1) the complex spike is a
distinctive electrophysiological signature of CWCs, which allowed us to be
confident that we were recording in the DCN even in the absence of sound-evoked
responses in the deafened mice and (2) CWCs provide a convenient readout of
non-auditory inputs conveyed by granule cells. Granule cells themselves are too
small to be reliably isolated using conventional extracellular recording
techniques.

In deafened mice, 9 of 11 complex-spiking units exhibited significant
simple and/or complex spike firing rate modulations related to licking (Fig. 4c–e, green
traces). The overall rate of complex spike firing also increased
slightly during licking (Fig. 4f). These
results indicate that DCN receives non-auditory information related to licking
behavior. Granule cells provide the main excitatory input to CWCs. Hence the
non-auditory, licking-related responses we observed in CWCs are likely due to
signals conveyed by parallel fibers.

We also recorded from complex-spiking units in hearing mice. Most
complex-spiking units exhibited simple and complex spike firing rate modulations
related both to licking (Fig. 4g) and to
presentation of the mimic when the mouse was still (Fig. 4h). Prominent responses to both licking and to
the mimic (Fig. 4i) are consistent with the
notion that CWCs receive both non-auditory and auditory signals conveyed by
granule cells. This is consistent with anatomical evidence for prominent
non-auditory as well as auditory input to GCDs[6] and previous electrophysiological
evidence for prominent auditory responses in complex-spiking units in awake
mice[31].
A role for the spinal trigeminal nucleus in cancelling self-generated
sounds
Based on previous microstimulation and anatomical tracing studies, the
spinal trigeminal nucleus (Sp5) is expected to be the major source of mossy
fiber input to DCN conveying somatosensory information related to licking
behavior[12,32, 33]. As expected from past studies in other mammals, injection
of an anterograde viral tracer (AAV2-GFP) into mouse Sp5 resulted in labeled
mossy fibers in the granule cell domains (GCDs) of DCN (n
= 3; Fig. 5a,
arrowheads) as well as in the cerebellum (data not shown).
If non-auditory, licking-related inputs from Sp5 serve to cancel out responses
to self-generated acoustic stimuli in DCN, transiently silencing such inputs
should reveal prominent licking-related responses in DCN neurons. Indeed,
micropressure injection of the action potential blocker lidocaine into Sp5 led
to an increase in overall firing in putative DCN output cells during licking as
well as an increased modulation of firing (Fig. 5b,d, red) that tracked the amplitude of the licking sound (Fig. 5b, black lines). No
such changes were observed after saline injection (Fig. 5c,d, purple). Furthermore, increases in
licking responses after lidocaine injection cannot be explained by differences
in sensitivity to acoustic stimuli between lidocaine and saline groups (Fig. 5e), changes in licking rate after
lidocaine injection (Fig. 5f), or changes
in the amplitude of licking sounds after lidocaine injection (Fig. 5g).
Figure 5
A role for the spinal trigeminal nucleus in cancelling self-generated sounds
in DCN
(a) Labeled mossy fibers were observed in DCN granule cell domains
(GCD) after injection of an anterograde viral tracer (AAV2-GFP) into the
ipsilateral Sp5. Scale bars: 200 μm. Right, higher
magnification views of areas indicated by dotted rectangle. Scale bars: 100
μm. White arrowheads indicate labeled mossy fibers in GCDs.
(b,c) Lick-triggered response of DCN cells before
(left) and after (right) injection of
lidocaine (b, n = 10) or saline
(c, n = 8) into Sp5. Thin lines are s.e.m. Solid black lines show
the RMS amplitude of the licking sound (scale bars: 1 a.u.). (d)
Lidocaine injection resulted in a significant increase in z-scored lick
responses in DCN units (P = 0.0098, Wilcoxon Signed
Rank Test, red) while no significant increases in z-scored lick
responses occurred after saline injection (P = 0.31,
Wilcoxon Signed Rank Test, purple). (e) Auditory
responses to the mimic were not significantly different in lidocaine and saline
groups (P = 0.87, Wilcoxon Rank Sum Test).
(f) Lick rate did not differ before and after injection of
lidocaine (P = 0.77, Wilcoxon Signed Rank Test) or
saline (P = 0.25, Wilcoxon Signed Rank Test).
(g) Changes in z-score lick responses were not correlated with
changes in the maximum RMS of the licking sound after lidocaine injection
(red, P = 0.36, linear regression
t-test). Changes in the maximum RMS of the licking sound did not differ between
lidocaine and saline groups (P = 0.51, Wilcoxon Rank
Sum Test).
Adaptive cancellation of sounds correlated with behavior in DCN
neurons
Studies of cerebellum-like structures in fish have shown that
cancellation of self-generated inputs is not fixed but reflects an adaptive
filtering process in which anti-Hebbian synaptic plasticity reduces correlations
between principal cell activity and behavior-related signals conveyed by granule
cells[14, 15]. Similar anti-Hebbian plasticity rules
have been described at granule cell synapses in DCN[7-9]. Adaptive filtering in DCN would explain both how diverse
sources of mossy fiber input are sculpted into patterns of synaptic input that
selectively cancel responses to self-generated sounds and how such patterns are
updated if the auditory consequences of a given behavior change. To test whether
DCN is capable of adaptive filtering we delivered an external sound (broadband
or bandpassed noise 5–15 kHz) temporally correlated with licking (30 ms
after tongue contact). The conditions were the same as those for the experiments
shown in Figure 3, except that many more
sound presentations were used. Recordings were made using both glass
microelectrodes and multi-site silicon probe electrodes (Supplementary Fig. 5). Use of the
latter aided the maintenance of single-unit isolation through long bouts of
licking. Responses of putative DCN output cells to the correlated sound declined
over the course of several minutes of pairing (>1000 paired lick-sound
presentations) (Fig. 6a–d,h,
red lines). Such declines were not due to overall changes
in firing rate, but rather were specific to the period of the noise-evoked
response (Fig. 6a–d,h,
black lines). Decreases in DCN responses to sounds
correlated with licking are unlikely to reflect adaptation of peripheral
auditory input as they were not observed in a separate group of DCN units in
which an identical external sound was presented at the same rate but at random
times relative to lick contact (Fig.
6e,f,h, yellow lines). Furthermore, no changes in
sound-evoked responses were observed in VCN units when the external sound was
temporally correlated with licking (Fig.
6g,h, blue lines). The magnitude of the reductions
in response to acoustic stimuli correlated with licking varied substantially
across DCN units (Fig. 6i). We found no
relationships between the magnitude of such reductions and a number of
behavioral and neural parameters, including licking rate, licking variability,
baseline firing rate, and the initial magnitude of noise-evoked responses (Supplementary Fig. 6).
More definitive criteria for identifying DCN cell types, such as juxtacellular
labeling and antidromic stimulation, along with a thorough characterization of
auditory response properties may, in future, provide insights into the source of
this variation. Overall, these results are consistent with a plastic
cancellation or adaptive filtering of self-generated stimuli in DCN similar to
that described previously in cerebellum-like structures in fish.
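The adaptive filtering idea tested here can be made concrete with a toy model. The sketch below is an illustration under strong assumptions, not a model fitted to the recordings: the principal cell is linear, granule cell inputs are reduced to one input per lick-phase bin, and the learning rate `lr`, bin count, and sound amplitude are arbitrary. The function name `run_pairing` is hypothetical. In place of random sound timing, the control condition sweeps the sound deterministically across lick phases so that it carries no fixed timing relative to the lick.

```python
def run_pairing(n_licks, sound_bin_for_lick, n_bins=20, amp=1.0, lr=0.05):
    """Simulate a principal cell whose granule-cell inputs tile the lick cycle.

    Returns (responses, weights): the response in the sound bin on every lick,
    and the final parallel-fiber weight for each lick-phase bin.
    """
    weights = [0.0] * n_bins  # one parallel-fiber weight per lick-phase bin
    responses = []
    for lick in range(n_licks):
        sound_bin = sound_bin_for_lick(lick)
        for t in range(n_bins):
            sensory = amp if t == sound_bin else 0.0
            rate = sensory + weights[t]  # firing relative to baseline
            if t == sound_bin:
                responses.append(rate)
            # Anti-Hebbian rule: an input active during excess firing is weakened,
            # and an input active during reduced firing is strengthened.
            weights[t] -= lr * rate
    return responses, weights

# Sound locked to a fixed phase of the lick cycle (correlated condition).
correlated, w_corr = run_pairing(1000, lambda lick: 6)

# Sound swept across lick phases (stand-in for the uncorrelated control).
uncorrelated, w_unc = run_pairing(1000, lambda lick: (7 * lick) % 20)

print(round(correlated[0], 3), round(correlated[-1], 3))
print(round(uncorrelated[0], 3), round(uncorrelated[-1], 3))
print(round(w_corr[6], 3))
```

In this sketch the response to the lick-locked sound decays toward zero while the learned weight in the paired bin approaches the negative of the sound amplitude, so omitting the sound after pairing would leave a negative image of the cancelled response. The swept sound is only weakly reduced. Whether cancellation in DCN actually relies on such negative images is not established by the present results.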
Figure 6
Adaptive cancellation of sounds correlated with behavior in DCN
(a–d)
Top, response of an example DCN unit to an acoustic stimulus
(broadband or bandpass-filtered noise, 5–15 kHz) presented correlated
with lick onset. Lighter traces show responses to later licks, averaged in bins
of 150 licks. Bottom, final response of this cell minus initial
response to the sound plus lick. Thin lines are s.e.m. Left gray area shows the
stimulus presentation period. Left dashed line shows the time of lick onset.
Right dashed line and gray box show the mean and standard deviation,
respectively, of the time of the next lick. (e, f) Same display for
two DCN units for which the acoustic stimulus was played uncorrelated with the
onset time of a lick. (g) Same display for an example VCN unit in
which the acoustic stimulus was presented correlated with the onset time of a
lick. (h) Group data showing average changes in noise-evoked
responses over the course of repeated stimulus presentations. For DCN correlated
units (red) the best fit decay rate was 0.0225 per 100 licks
(n = 20, P = 6 × 10^−15, linear regression t-test), for DCN uncorrelated
units (yellow) the best fit decay rate was 0.001 but was not
significantly different from 0 (n = 11,
P = 0.42, linear regression t-test), and for VCN
units (blue) the best fit decay rate was 0.001 but was not
significantly different from 0 (n = 7,
P = 0.56, linear regression t-test). Error bars are
s.e.m. (i) Scatter plot of decay rates of best-fit exponentials fit
separately for every unit. Horizontal black lines show the median value for each
group. Open symbols correspond to the units used as examples in panels
a–g.
Discussion
Distinguishing between external and self-generated sensory stimuli is
fundamental for perception, and is thought to involve a comparison between external
sensory input and internal reference signals related to the animal’s own
behavior, for example motor corollary discharge or proprioception[34]. The present study provides
evidence that such a comparison takes place at the first central stage of mammalian
auditory processing in the DCN. More specifically, our results suggest a scheme
similar to that already well-established in cerebellum-like structures in fish, in
which behavior-related signals conveyed by a mossy fiber-granule cell-parallel fiber
system cancel out responses to self-generated sensory stimuli in principal
neurons[14, 15]. Several independent lines of evidence from
the present study support such a function for DCN. First, responses to sounds
generated by licking are substantially weaker in DCN compared to VCN and such
differences are not accounted for by weaker responses to external acoustic stimuli
in DCN or by an overall suppression of DCN responses during licking. Second,
non-auditory responses to licking behavior are observed in putative CWCs, presumably
due to non-auditory signals conveyed by mossy fibers and granule cells. Third,
inactivation of Sp5, a prominent source of somatosensory mossy fiber input to DCN,
revealed responses to self-generated sounds in DCN units that resembled those
observed in VCN units, suggesting that such input normally functions to cancel DCN
responses to self-generated sounds. Finally, repeated pairing of acoustic stimuli
with licking resulted in a gradual reduction of DCN responses to the paired
stimulus. Importantly, such reductions were not observed when stimuli were presented
at the same rate but uncorrelated with the time of lick contact.

Cancellation of self-generated sounds at an early processing stage could
provide mammals with a dedicated channel through which salient or unexpected
auditory signals can rapidly guide motor output, such as escape or orienting
behavior. This interpretation is consistent with effects of DCN lesions, which
disrupt orienting towards but not discriminating between sound source
locations[35, 36] and the fact that, in addition to projecting
to the inferior colliculus, DCN projects directly to auditory thalamus[37], auditory cortex[38], and regions involved in the
acoustic startle response[39]. To
our knowledge, responses to self-generated sounds have not been studied at the level
of the inferior colliculus. Based on the present results, we would predict that a
subset of inferior colliculus neurons selectively encodes external sounds and that
this subset receives its dominant input from DCN rather than VCN. Though we focused
on a single behavior and a single source of mossy fiber input, the fact that DCN
receives mossy fiber inputs from numerous brain regions conveying a wide range of
sensory and motor signals implies a much broader capacity for canceling predictable
auditory input[6]. We also note that
our results by no means rule out the possibility that additional sources of mossy
fiber inputs (besides those originating from Sp5) play a role in cancelling
self-generated sounds caused by licking behavior. For example, mossy fiber input to
DCN granule cell domains originating from the pontine nuclei could provide
motor-related signals relevant for cancelling the auditory consequences of the
animal’s own movements, including licking[40].

Though our results suggest that the integration of non-auditory and auditory
inputs to DCN serves to cancel responses to self-generated stimuli, they do not
rule out other functions for multimodal integration in DCN. Numerous lines of
evidence suggest that the DCN plays an important role in processing spectral cues
for sound localization[6, 10, 35].
A recent study provided evidence that the integration of auditory and vestibular
information in DCN could aid in distinguishing changes in auditory input due to
motion of an external sound source from those due to self-motion[13]. Specifically, Wigderson et al. demonstrate
that vestibular and auditory inputs are combined nonlinearly in putative DCN output
cells. This is a different mode of integration from that suggested here and by
studies of other cerebellum-like structures in fish in which behavior-related
signals conveyed by mossy fibers are used to subtract out self-generated signals.
Since vestibular inputs would not have been engaged during the head-fixed licking
behavior we studied, no direct comparison between the two studies is possible.
However, determining whether different sources of mossy fiber inputs to DCN, e.g.,
vestibular versus somatosensory, perform similar or different computations is an
important question for future studies.
Key questions remain regarding the circuit mechanisms underlying the
cancellation of self-generated sounds in DCN reported here. In cerebellum-like
structures in fish, cancellation is due to the generation and subtraction of negative
images of the responses of principal cells to self-generated inputs. Such negative
images are formed by anti-Hebbian synaptic plasticity acting on corollary discharge,
proprioceptive, and electrosensory signals conveyed by parallel fibers[14, 15]. Owing both to limits on data collection imposed by satiation
and to the technical difficulty of maintaining stable single-unit recordings in
brainstem through long bouts of licking, we focused exclusively on providing evidence
for cancellation. A crucial next step will be to determine whether cancellation of
self-generated sounds in DCN is due to the generation of negative images.
Furthermore, genetic tools available in mice should make it possible to perform a
detailed dissection of the mechanisms underlying the cancellation of self-generated
sounds in DCN. Key questions include the functional roles of specific cell types,
such as the CWCs, and the roles of specific sites and mechanisms of plasticity, such
as spike timing-dependent plasticity at parallel fiber synapses onto fusiform cells
and CWCs described in vitro[7, 8, 41].
Finally, our results are intriguing from evolutionary and comparative
perspectives. The brains of most vertebrates contain both a cerebellum and one or
more sensory structures with circuitry closely resembling that of the cerebellum
[2, 6, 15]. Though
similarities between different cerebellum-like structures and the cerebellum are
well-established in terms of their evolution, development, gene expression patterns,
circuitry and synaptic plasticity, the question of whether they perform similar
functions has been more difficult to address. Cerebellum-like structures associated
with electrosensory processing in three distinct groups of fish have been shown to
act as adaptive filters[14, 15] and numerous lines of evidence
also exist supporting such a role for the mammalian cerebellum[42, 43].
In both cases granule cells convey a rich variety of signals[44-48] and a separate, non-plastic input (peripheral sensory input in
the case of cerebellum-like structures and climbing fiber input in the case of
cerebellum) instructs plasticity at granule cell synapses such that output that is
predictable (in the case of cerebellum-like sensory structures) or associated with
errors in motor performance (in the case of the cerebellum) is gradually reduced.
Interestingly, adaptive cancellation of self-generated vestibular inputs has been
demonstrated in neurons of the fastigial nucleus and vestibular nucleus in
primates[49, 50]. Hence evidence provided here for sensory
cancellation and adaptive filtering in DCN suggests that a core function may be
shared by cerebellum-like structures and the cerebellum across vertebrate
phylogeny.
Methods
All experimental protocols were approved by the Columbia University
Institutional Animal Care and Use Committee. Adult male wild-type mice
(129S6/SvEvTac) were used for all experiments. Mice were purchased from Taconic
Biosciences (Hudson, NY) and housed in an on-site animal facility on a 12 hour
light-dark cycle. Most experiments were performed during the light cycle. Data
collection and analysis were not performed blind to the conditions of the
experiments. All relevant data from this study are available from the authors upon
reasonable request.
Surgery
Mice were anesthetized with isoflurane (1.5–2%) and
placed in a stereotax equipped with zygomatic ear bars (Kopf Instruments). The
skull was exposed and a small craniotomy 200–500 μm in diameter
was made over the right dorsal cochlear nucleus (5.5 mm posterior to bregma and
2.3 mm lateral to the midline). The craniotomy was covered with silicone
elastomer (Kwik-Sil, WPI, Sarasota, FL). A custom headplate was attached to the
skull using dental cement (C&B Metabond, Parkell, Edgewood, NY). Mice were
allowed to recover for 3 days prior to the start of experiments.
Experimental apparatus and auditory stimulus presentation
All mouse behavior and neurophysiology experiments were performed in a
double walled sound-attenuating chamber (Double Deluxe Model, Gretchken
Industries). The ambient noise within the chamber was <30 dB SPL as measured
by a sound pressure level meter (Bruel and Kjaer Type 2240). A custom head
fixation device was used to secure the animal via two attachment points to a
stainless steel headplate and allowed for consistent positioning across multiple
recording sessions. The animal’s body was additionally secured between
two pieces of styrofoam molded to its body. A stainless steel lick spout was
positioned in front of the animal’s mouth and licks were detected using
standard methods. Acoustic stimuli were generated using Spike2 software
(Cambridge Electronic Design) and delivered using an electrostatic speaker (ES-1
Tucker Davis Technologies) positioned approximately 10 cm in front of the mouse
just to the right of the midline. Sound pressure levels of acoustic stimuli as
measured in dB SPL were calibrated to the location of the animal’s right
ear. The frequency response of the sound system was measured to be flat
(±4 dB) from 1 kHz to 50 kHz using a ¼″
condenser microphone (377C01, PCB Piezotronics), attached to a preamplifier
(426B03, PCB Piezotronics) positioned at the location of the mouse’s
right ear. Sounds caused by licking were monitored by a small electret
microphone (Knowles model 23329N) placed just above the lick spout. Microphone
signals were sampled at 100 kHz and digitized using an analog to digital
converter (Power 1401, Cambridge Electronic Design).
The lick mimic was constructed from segments of microphone recordings spanning 50
ms before to 150 ms after tongue contact, bandpass filtered
between 1 and 50 kHz (n = 5 mice). We transformed each
segment to a spectrogram using a short-time Fourier transform (Hamming window
with a width of 10.24 ms and a stride of 5.12 ms). We then constructed the mimic
by performing principal component analysis on this set of lick-triggered
spectrograms and making a weighted sum of the first five principal components.
This resulted in a mimic spectrogram, which we used as a spectro-temporal filter
to convolve with a random signal. This resulted in a stimulus (the lick mimic)
which contained the most prominent spectro-temporal features of the licking
sound (including distinct spectral peaks at 2, 8, and 30 kHz) with little power
elsewhere. Due to issues such as bone conduction we could not measure the exact
loudness of natural licking sounds. The lick mimic was replayed at a loudness
that evoked a response in VCN units that was, on average, similar to that evoked
by licking. This same loudness (12 dB SPL) was used subsequently for all
experiments involving the mimic.
Behavioral training
Mice were allowed to recover for 3 days after surgery before beginning
water deprivation and habituation to head restraint in the experimental
apparatus. Weight was monitored daily and additional water was given in the home
cage if the animal’s weight fell below 80% of its initial
pre-surgical weight. Extracellular recordings from DCN and VCN units were then
performed during daily sessions lasting 2–3 hours. Mice licked roughly
3,000 times per session.
Extracellular recording and identification of DCN and VCN neurons
Standard procedures were used for extracellular recording using glass
microelectrodes (5–20 MΩ resistance). Pipettes with a long taper
were used to avoid tissue damage. On the day of recording, mice were placed into
the head restraint and the silicone elastomer was removed and 0.9%
saline was placed over the exposed craniotomy. The microelectrode was lowered
into the craniotomy vertically. As the electrode was advanced through the
cerebellum a series of 200 ms long search tones from 5 kHz to 50 kHz (in 5 kHz
steps) were delivered. Entrance into DCN was marked by a transient increase in
electrode resistance along with the sudden appearance of tone-evoked multi-unit
activity which occurred ~2700–3200 μm below the surface of the
cerebellum. The microelectrode was then advanced in 1 μm steps until a
unit was isolated. Complex-spiking units were the first units encountered on an
electrode penetration through DCN and could be unambiguously identified based on
their distinctive complex spikes. Complex spikes are stereotyped, high-frequency
action potential bursts superimposed on a slower depolarization and are not
observed in any DCN cell types except CWCs[24, 25]. Similar to
previous in vivo extracellular recording studies of DCN in a
variety of species, including mouse[18], we defined complex spikes as high-frequency bursts
(ISIs < 3.5 ms) of 2–5 action potentials. Complex spikes were
identified automatically in Spike2 using custom written scripts and then
confirmed individually. Within such bursts, action potentials successively
widened and decreased in amplitude (Fig.
4b). Complex-spiking units were isolated 50–200 μm from
the surface of the DCN. DCN units lacking complex spikes, referred to here as
simple spiking units, were isolated 100–300 μm from the surface
of the DCN. Complex-spiking units were never found ventral to simple spiking
units on the same electrode penetration consistent with the known
cytoarchitecture of the DCN. Passage from DCN into VCN was determined by
monitoring the tone frequency that most strongly drove multi-unit activity for
each 50 μm advance of the electrode. As the electrode advanced
ventrally, the best frequency for driving multi-unit activity progressively
decreased. A sudden increase in the best frequency (generally from ~5 kHz to ~20
kHz and usually occurring 500–600 μm below the surface of DCN)
signified entrance into the VCN. Units which were isolated at least 100
μm ventral to the best frequency reversal (~800–1000 μm
below the surface of the DCN) and which showed clear tone-evoked responses were
classified as VCN units. Units isolated less than 100 μm from the best
frequency reversal were not included in the analysis. Histological verification
of DCN and VCN recording sites was performed by iontophoresis of dextran
conjugated Alexa Fluor 594 (D22913, Thermo Fisher Scientific) at recording sites
between 100 and 300 μm below the surface of DCN (depths at which most
DCN simple spiking units were isolated) and at 900 μm (the depth at
which most VCN units were isolated). Only units that remained well-isolated
through at least 75 licks were included in the analysis. Sounds associated with
licking contain most power between 2–15 kHz, which corresponds to the
lower portion of the mouse hearing range. For this reason we focused our
recordings on regions of the cochlear nucleus that represent these frequencies.
A subset of the recordings in Figure 6 (DCN
correlated, n = 10/20; DCN uncorrelated,
n = 5/11; VCN correlated, n
= 3/7) were performed using a 16 channel silicon probe (Neuronexus,
A1x16–5mm-25–177-A16). Silicon probe recordings proved superior
to glass microelectrode recordings in terms of their stability during licking
behavior. Probes consisted of a vertical linear array of 15 micron diameter
electrode sites spaced 25 microns apart. Impedances ranged from ~2–6
kOhms. Recording tracks were made in DCN or VCN until a well-isolated single
unit emerged on at least one electrode site. Most sites exhibited only
multi-unit activity and were not analyzed. The same electrophysiological
signatures described above were used to identify the dorsal and ventral cochlear
nuclei. Rank sum tests revealed no difference between probe and glass recordings
in the median decay rate of cells in all three groups shown in Figure 6 (DCN correlated: P =
0.09, DCN uncorrelated: P = 0.79, VCN:
P = 0.63).
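The burst criterion used above to define complex spikes (2–5 action potentials with ISIs < 3.5 ms) can be illustrated with a minimal sketch. This is not the custom Spike2 script used in the study; the function name and arguments are hypothetical, and the decision to discard runs longer than five spikes is an assumption, since the text does not say how such runs were treated.

```python
def find_complex_spikes(spike_times_ms, max_isi_ms=3.5, min_spikes=2, max_spikes=5):
    """Return (first_index, last_index) pairs of candidate complex spikes.

    spike_times_ms must be sorted in ascending order (times in ms).
    A candidate is a run of consecutive spikes whose ISIs are all below
    max_isi_ms and which contains min_spikes..max_spikes action potentials.
    """
    bursts = []
    n = len(spike_times_ms)
    i = 0
    while i < n:
        j = i
        # Extend the run while each successive ISI stays below the threshold.
        while j + 1 < n and spike_times_ms[j + 1] - spike_times_ms[j] < max_isi_ms:
            j += 1
        count = j - i + 1
        # Assumption: runs longer than max_spikes are discarded, not split.
        if min_spikes <= count <= max_spikes:
            bursts.append((i, j))
        i = j + 1
    return bursts
```

Candidates found this way would still need individual confirmation, for example by checking that spikes within a burst successively widen and decrease in amplitude, as described above.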
Viral Injections
A nanoliter injector (504126, WPI instruments) was used to inject
adeno-associated virus expressing green fluorescent protein. The pipette was
positioned over the coordinates 7.2 mm posterior to bregma and 1.8 mm right of
the midline and lowered until the tip touched the surface of the cerebellum. The
pipette was then lowered 3.5 mm below the surface of the cerebellum to the base
of the spinal trigeminal nucleus. 27 nL of the virus was injected in three 9 nL
pulses. Virus was also injected at depths of 3.2, 2.9, and 2.7 mm below the
surface of the cerebellum. The pipette was then slowly raised out of the
cerebellum and the incision was closed using cyanoacrylate glue (Vetbond, 3M,
Maplewood, MN). Two weeks after surgery, mice were anesthetized with
ketamine/xylazine and perfused with 4% formaldehyde. The brains were
dissected from the skull and allowed to post-fix in 4% formaldehyde
overnight. They were then cryoprotected in a 30% sucrose solution and
sectioned on a cryostat. Sections were then mounted on glass slides (Superfrost,
Fisher Scientific, Waltham, MA), counterstained with DAPI, and imaged on a
confocal microscope (Carl Zeiss Microscopy, Peabody, MA).
Deafening
Mice were deafened bilaterally. Surgery for deafening mice was performed
using 2–4% isoflurane. An incision was made just posterior to
the tragus and extended ventrally. The tympanum, malleus, and incus were
visualized through the auditory meatus. Using fine forceps the tympanum was
ruptured and the malleus and incus were removed. The stapes was removed exposing
the oval window with care taken not to damage the stapedial artery. Using a 30
gauge needle, approximately 10–20 μL of 1.0 mg/mL kanamycin was
injected through the oval window and into the cochlea. The middle ear was packed
with gel foam and the mouse was allowed to recover in its home cage. Deafening
was verified by lack of observable behavioral responses to acoustic stimuli and
by recording sound evoked field potentials to broadband noise (50 ms,
6–90 dB SPL) in DCN ~75 μm below the first observed
complex-spiking unit. This was done both before and 2 days after surgical
deafening in each mouse. DCN recordings were performed 2–4 days after
surgery. Recording locations within DCN were confirmed histologically using
iontophoresis of dextran-conjugated Alexa 594 as described above.
Lidocaine injections into Sp5
A small craniotomy (~300 μm diameter) was made prior to
attachment of the headplate at coordinates 7.2 mm posterior to bregma and 1.8 mm
lateral to the midline and covered with silicone elastomer. On the day of the
experiment, a glass micropipette with a long taper was pulled using a pipette
puller (PC-10, Narishige Group) and manually broken to 3.5 μm diameter under a
microscope. The pipette was then filled with 2% lidocaine in
0.9% saline with care taken to avoid air bubbles in the tip. The pipette
was then coupled to a micropressure injector (Picospritzer MK III, Parker
Instrumentation) and successful ejection of lidocaine was confirmed visually to
ensure the tip was not clogged. The lidocaine pipette was advanced into Sp5 at an
angle of 12.8 degrees. For Sp5 inactivation DCN unit responses were recorded for
~200 licks before ~100 nL of lidocaine was injected in a single pulse. Location
of the lidocaine pipette within DCN was verified histologically using
iontophoresis of dextran-conjugated Alexa 594 as described above.
Lick-sound pairing
After isolation of a unit, access to water was given and contact to the
lick spout by the animal’s tongue was paired with a 30 ms noise
(15–71 dB SPL, broadband or bandpass filtered 5–15 kHz). In
the correlated condition the noise was presented 30 ms after contact with the
lick spout. The pairing was conducted continuously until the animal stopped
licking or unit isolation was lost. In the uncorrelated condition presentation
of the noise during licking was unrelated to the tongue’s contact with
the spout and was instead presented at random intervals of 120–160 ms.
Since these intervals are similar to inter-lick intervals the overall rate of
sound presentations was similar in the correlated and uncorrelated conditions.
Correlated versus uncorrelated conditions were tested in the same mice on
alternating sessions. The condition to be tested during a given session was
pre-determined prior to isolating a unit.
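The timing of the two pairing conditions can be sketched as follows. This is only an illustration of the schedule described above, not the stimulus-delivery code used in the experiments. The function names are hypothetical, and drawing the uncorrelated intervals from a uniform distribution is an assumption; the text specifies only that they were random and fell between 120 and 160 ms.

```python
import random


def correlated_onsets(lick_times_ms, delay_ms=30.0):
    """Noise onsets locked to tongue contact: one per lick, delay_ms later."""
    return [t + delay_ms for t in lick_times_ms]


def uncorrelated_onsets(start_ms, stop_ms, min_interval_ms=120.0,
                        max_interval_ms=160.0, rng=None):
    """Noise onsets unrelated to licking, separated by random intervals.

    Assumption: intervals are drawn uniformly between the two bounds.
    """
    rng = rng or random.Random()
    onsets = []
    t = start_ms + rng.uniform(min_interval_ms, max_interval_ms)
    while t < stop_ms:
        onsets.append(t)
        t += rng.uniform(min_interval_ms, max_interval_ms)
    return onsets
```

Because the mean interval in the uncorrelated schedule is close to a typical inter-lick interval, both schedules deliver a similar number of sounds per unit time while differing in their temporal relationship to licking.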
Data analysis and statistics
All analyses were performed using custom written scripts for Matlab
(Mathworks, Natick, MA) and Spike2. No statistical methods were used to
predetermine sample sizes. Comparisons between two groups were made by
Mann–Whitney U-test or Wilcoxon signed rank test for
paired groups. Tests of the significance of linear regression slopes used a
linear regression t-test. For the linear regression t-test residuals were
assumed to be normally distributed but this was not formally tested. Differences
were considered statistically significant at P < 0.01. Data
are presented as mean ± s.e.m. unless indicated otherwise.
Lick sound spectrograms
To compute the average spectrogram of the sound associated with
licking we first bandpass filtered raw microphone traces removing
frequencies below 1 kHz and above 50 kHz (the highest frequency that could
be detected by our equipment). 300 ms segments of the filtered microphone
recording centered on the onset of each lick were transformed with a short
time Fourier transform (Hamming window with a width of 10.24 ms and a stride
of 5.12 ms) to obtain a set of lick-centered spectrograms. These were
averaged to obtain a lick-triggered average spectrogram. Time-frequency
peaks were found by first applying a 2-D median filter (widths 290 Hz, 3 ms)
to individual spectrograms and then convolving with a 2-D Gaussian kernel
with widths 1.5 kHz and 20 ms. We then identified time-frequency peaks as
the local maxima of the filtered spectrograms.
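A minimal sketch of the lick-triggered average spectrogram is given here, using the window width (10.24 ms) and stride (5.12 ms) stated above. It assumes NumPy, a microphone trace that has already been bandpass filtered, and lick onsets expressed as sample indices. The function name is hypothetical, the use of squared magnitude (power) is an assumption, and the median filtering, Gaussian smoothing, and peak-finding steps are not included.

```python
import numpy as np


def lick_triggered_spectrogram(trace, lick_samples, fs=100_000,
                               pre_s=0.150, post_s=0.150,
                               win_s=0.01024, hop_s=0.00512):
    """Average short-time Fourier power over 300 ms lick-centered segments.

    Returns (freqs_hz, mean_power) with mean_power shaped (n_freqs, n_frames).
    """
    win = int(round(win_s * fs))    # 1024 samples at 100 kHz
    hop = int(round(hop_s * fs))    # 512 samples at 100 kHz
    pre = int(round(pre_s * fs))
    post = int(round(post_s * fs))
    window = np.hamming(win)
    specs = []
    for s in lick_samples:
        if s - pre < 0 or s + post > len(trace):
            continue  # skip licks too close to the ends of the recording
        seg = trace[s - pre:s + post]
        starts = range(0, len(seg) - win + 1, hop)
        frames = np.stack([seg[k:k + win] * window for k in starts])
        # Assumption: power (squared magnitude) is the quantity averaged.
        specs.append(np.abs(np.fft.rfft(frames, axis=1)) ** 2)
    if not specs:
        raise ValueError("no lick segments fit inside the recording")
    mean_power = np.mean(specs, axis=0).T
    freqs_hz = np.fft.rfftfreq(win, d=1.0 / fs)
    return freqs_hz, mean_power
```

With a 1024-sample window at 100 kHz the frequency resolution is roughly 98 Hz, which is sufficient to separate spectral peaks a few kHz apart.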
RMS amplitude of microphone traces
To compute the RMS amplitude of the sound associated with licking
microphone recordings were first bandpass filtered (1–50 kHz). We
then computed the RMS amplitude of this filtered microphone trace by
convolving the squared trace with a moving average kernel of width 1 ms and
taking the square root of the result. These recordings were then aligned to
the time of tongue contact with the lick spout and averaged across
licks.
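The running RMS computation described above can be sketched as follows. The function name is hypothetical, the input is assumed to be already bandpass filtered, and returning only fully overlapping windows (no edge padding) is an assumption about boundary handling that the text does not specify. Alignment to tongue contact and averaging across licks are not shown.

```python
import math


def moving_rms(trace, fs=100_000, width_ms=1.0):
    """Square root of a moving average of the squared trace.

    Returns one value per fully overlapping window, so the output has
    len(trace) - n + 1 samples, where n is the window length in samples.
    """
    n = max(1, int(round(width_ms * 1e-3 * fs)))
    if len(trace) < n:
        raise ValueError("trace is shorter than the averaging window")
    # A running sum of squares lets each window total be read off directly.
    csum = [0.0]
    for x in trace:
        csum.append(csum[-1] + x * x)
    return [math.sqrt((csum[k + n] - csum[k]) / n)
            for k in range(len(trace) - n + 1)]
```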
Average and Z-scored electrophysiological responses during licking and
mimic presentation
To compute average responses to licking or during delivery of the
mimic, spike trains were convolved with a normalized sum-of-two-exponentials
kernel, with a rise time of 5 ms and a decay time of 20 ms. Averages were
aligned either on tongue contact with the lick spout or mimic delivery and
average baseline firing was subtracted. The baseline firing rate was taken to
be the average firing rate in periods at least 25 ms before the next lick
or mimic onset and at least 150 ms after the previous lick or mimic onset.
Peak-to-trough firing rates were computed by taking the average licking or
mimic response in a 200 ms window centered on the tongue-to-spout contact or
mimic onset and determining the difference in the maximum to minimum firing
rates. To compute z-scores we first took the maximum of the average licking
or mimic response in a 200 ms window centered on tongue-to-spout contact or
mimic onset. We then created shuffled spike trains of approximately the same
length as the original spike train by randomly sampling from the
inter-spike-interval distribution of the real spike train. Each shuffled
spike train was convolved with the same kernel as the real spike train, its
lick- or mimic-triggered average computed, and the maximum firing rate of
this triggered average taken in the same 200 ms window. This was repeated
500 times and the maximum of the triggered average of the real spike train
was expressed in units of the standard deviation from the mean of the
shuffle distribution, i.e. z-scored based on the shuffle distribution. We
determined the significance of neural responses by computing approximate
p-values for the recorded maximum lick-triggered rate, which were estimated
by the fraction of shuffled-spike trains showing maximum lick-triggered
responses greater than that of the real spike train.
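The shuffle-based z-score and approximate p-value can be sketched as follows, assuming NumPy. This is an illustration rather than the analysis code used in the study, and all names are hypothetical. Several details are assumptions: spikes are binned at 1 ms, the kernel is a causal difference of exponentials with time constants equal to the stated rise and decay times, each shuffled train reuses the first spike time and the same number of resampled ISIs, and baseline subtraction is omitted.

```python
import numpy as np


def make_kernel(rise_ms=5.0, decay_ms=20.0, length_ms=200.0):
    """Difference-of-exponentials kernel on a 1 ms grid, unit area in seconds."""
    t = np.arange(0.0, length_ms, 1.0)
    k = np.exp(-t / decay_ms) - np.exp(-t / rise_ms)
    return k / (k.sum() * 1e-3)  # convolved spike counts come out in spikes/s


def triggered_max(spike_ms, event_ms, duration_ms, kernel, half_win_ms=100):
    """Maximum of the event-triggered average rate in a 200 ms centered window."""
    n = int(duration_ms)
    counts = np.bincount(np.floor(spike_ms).astype(int), minlength=n)[:n]
    rate = np.convolve(counts, kernel)[:n]
    idx = [int(e) for e in event_ms if half_win_ms <= e < n - half_win_ms]
    if not idx:
        raise ValueError("no events with a complete window")
    avg = np.mean([rate[i - half_win_ms:i + half_win_ms] for i in idx], axis=0)
    return avg.max()


def shuffle_zscore(spike_ms, event_ms, duration_ms, n_shuffles=500, seed=0):
    """Z-score and approximate p-value of the peak triggered rate."""
    rng = np.random.default_rng(seed)
    kernel = make_kernel()
    spike_ms = np.sort(np.asarray(spike_ms, dtype=float))
    isis = np.diff(spike_ms)
    real = triggered_max(spike_ms, event_ms, duration_ms, kernel)
    null = np.empty(n_shuffles)
    for s in range(n_shuffles):
        # Resample ISIs with replacement to build a surrogate train of
        # approximately the same length as the original.
        steps = np.cumsum(rng.choice(isis, size=isis.size))
        fake = spike_ms[0] + np.concatenate(([0.0], steps))
        fake = fake[fake < duration_ms]
        null[s] = triggered_max(fake, event_ms, duration_ms, kernel)
    z = (real - null.mean()) / null.std()
    p = float(np.mean(null > real))
    return z, p
```

With 500 shuffles the smallest nonzero p-value that can be resolved is 1/500 = 0.002, which is why the p-values obtained this way are approximate.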
Correlated and uncorrelated sound-lick pairings
The noise-evoked response is defined in bins of 150 stimulus
presentations. For each 150 presentations the response is defined as the
maximum of the average noise-evoked response during that stimulus period
minus the baseline rate during that period. For each unit the response is
normalized to equal one in the first bin. We performed a linear regression
between the stimulus bin and the log of the noise-evoked responses for each
population, in order to extract a decay rate for each population.
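The population decay rate can be sketched as an ordinary least-squares fit of log response against bin index. The function name is hypothetical, the per-bin responses (maximum minus baseline, one value per 150 presentations) are assumed to be precomputed, and skipping non-positive responses, whose logarithm is undefined, is an assumption not stated in the text.

```python
import math


def population_decay_rate(units):
    """Slope of log(normalized response) versus bin index, pooled over units.

    units is a list of per-unit response lists; each unit is normalized so
    that its first bin equals one before taking the logarithm.
    """
    xs, ys = [], []
    for responses in units:
        first = responses[0]
        for b, r in enumerate(responses):
            # Assumption: non-positive values are skipped (log undefined).
            if first > 0 and r > 0:
                xs.append(float(b))
                ys.append(math.log(r / first))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two usable points")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all points fall in the same bin")
    return sxy / sxx
```

A negative slope indicates that noise-evoked responses shrink over successive bins; a slope near zero indicates no systematic change.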