
A cerebellum-like circuit in the auditory system cancels responses to self-generated sounds.

Shobhit Singla¹, Conor Dempsey¹, Richard Warren¹, Armen G Enikolopov², Nathaniel B Sawtell¹.

Abstract

The dorsal cochlear nucleus (DCN) integrates auditory nerve input with a diverse array of sensory and motor signals processed in circuitry similar to that of the cerebellum. Yet how the DCN contributes to early auditory processing has been a longstanding puzzle. Using electrophysiological recordings in mice during licking behavior, we show that DCN neurons are largely unaffected by self-generated sounds while remaining sensitive to external acoustic stimuli. Recordings in deafened mice, together with neural activity manipulations, indicate that self-generated sounds are cancelled by non-auditory signals conveyed by mossy fibers. In addition, DCN neurons exhibit gradual reductions in their responses to acoustic stimuli that are temporally correlated with licking. Together, these findings suggest that DCN may act as an adaptive filter for cancelling self-generated sounds. Adaptive filtering has been established previously for cerebellum-like sensory structures in fish, suggesting a conserved function for such structures across vertebrates.

Substances: Lidocaine

Year: 2017 PMID: 28530663 PMCID: PMC5525154 DOI: 10.1038/nn.4567

Source DB: PubMed Journal: Nat Neurosci ISSN: 1097-6256 Impact factor: 24.884

The first central stage of mammalian auditory processing occurs within the dorsal and ventral divisions of the cochlear nucleus[1]. Based on similarities in their evolution, development, gene expression patterns, and anatomical arrangement, the DCN is considered to belong to a class of so-called cerebellum-like sensory structures[2-6]. Other cerebellum-like structures include the first central stages of electrosensory and mechanosensory lateral line processing in several groups of fish. Numerous cell and fiber types are shared by all of these cerebellum-like structures and the cerebellum itself including: mossy fibers, granule cells, parallel fibers, Golgi cells, molecular layer interneurons, and Purkinje or Purkinje-like cells. A hallmark of the circuitry of cerebellum-like sensory structures is the integration of direct input from peripheral sensory receptors (e.g. electroreceptors in the case of cerebellum-like structures in fish and auditory nerve fibers in the case of DCN) with a diverse array of sensory and motor signals conveyed by a granule cell-parallel fiber system. A primary site of this integration within DCN is the fusiform cell. Fusiform cells are also the major output cell of DCN and project to higher stages of auditory processing such as the inferior colliculus. The basilar dendrites of fusiform cells are contacted by auditory nerve fibers, which form a tonotopic map within the deep layer of DCN (Supplementary Fig. 1)[1, 6]. Their apical dendrites extend into a superficial molecular layer where they are contacted by parallel fibers. Parallel fibers arise from granule cells located in so-called granule cell domains (GCDs) around the margins of the nucleus and cross through different tonotopic regions of DCN[4]. Granule cells receive a wide variety of signals, both auditory and non-auditory, from mossy fibers originating in a number of different brain regions[6]. 
Parallel fiber synapses, but not auditory nerve fiber synapses, have been shown to exhibit forms of long-term associative synaptic plasticity in vitro[7-9]. Though previous in vivo studies of DCN have extensively characterized auditory response properties in anesthetized or decerebrate animals[10], much less is known about the functional significance of its cerebellum-like circuitry[11-13]. Some of the best clues come from studies of cerebellum-like structures associated with electrosensory processing in fish. Such studies have shown that anti-Hebbian synaptic plasticity acting on proprioceptive, electrosensory, and motor corollary discharge signals conveyed by parallel fibers serves to cancel principal cell responses to self-generated electrosensory inputs, e.g. those arising from the fish’s own movements or electromotor behavior[14, 15]. Cancellation of self-generated electrosensory inputs allows externally-generated, behaviorally relevant stimuli to be processed more effectively. Guided by these results, we set out to test the hypothesis that the cerebellum-like circuitry of the DCN functions to cancel responses to self-generated sounds. To this end we developed a preparation to study neural responses to self-generated sounds in the auditory brainstem of awake, behaving mice. We chose licking behavior because it is stereotyped and repetitive, can be elicited in head-fixed animals during electrophysiological recordings, and, as we demonstrate, generates sounds which are a potential source of interference for the mouse auditory system.

Results

DCN neurons respond preferentially to external versus self-generated sounds

We found that rhythmic licking generates sounds within the hearing range of the mouse and that such sounds exhibit stereotyped spectral and temporal profiles that were similar across mice (Fig. 1a, Supplementary Fig. 2 and Supplementary Video 1). The temporal profile of the licking sound is shown by the root mean squared (RMS) amplitude of the microphone recording aligned to tongue contact with the lick spout (Fig. 1a, white trace). Though the exact physical origin of the licking sounds was not determined, tongue-to-spout contact appears not to be the main cause. As can be seen in both the spectrogram and RMS amplitude trace from the representative mouse shown in Figure 1a, licking sounds typically consist of an early component that begins before contact as well as a larger late component that peaks ~50 ms after contact, during tongue retraction (Fig. 1a and Supplementary Fig. 2).
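The alignment described above (RMS amplitude of the microphone signal, averaged around tongue contact) can be illustrated with a minimal sketch. This is not the analysis code used in the study; the function names, the window length, and the sample-index representation of contact times are assumptions made purely for illustration.

```python
import math

def sliding_rms(signal, window):
    """RMS amplitude in a centered window of `window` samples (truncated at the edges)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        seg = signal[max(0, i - half): i + half + 1]
        out.append(math.sqrt(sum(x * x for x in seg) / len(seg)))
    return out

def event_triggered_average(trace, event_indices, pre, post):
    """Average `trace` from `pre` samples before to `post` samples after each event.

    Events too close to either end of the trace are skipped.
    """
    segments = [trace[i - pre: i + post + 1] for i in event_indices
                if i - pre >= 0 and i + post < len(trace)]
    if not segments:
        raise ValueError("no event has a complete window")
    n = len(segments)
    return [sum(seg[k] for seg in segments) / n for k in range(pre + post + 1)]

# Hypothetical example: a brief sound 5 samples after each of three contacts.
contacts = [50, 150, 250]
mic = [0.0] * 300
for c in contacts:
    mic[c + 5] = 1.0
lick_triggered_rms = event_triggered_average(sliding_rms(mic, 1), contacts, pre=10, post=20)
```

In this toy case the averaged trace peaks 5 samples after contact (index 15 of the returned list), mirroring the way the late component of the licking sound appears after contact in the aligned RMS trace.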

Figure 1

Self-generated sounds strongly affect VCN but not DCN neurons

(a) Average spectrogram of self-generated sounds during licking for a representative mouse. Arrow and dotted line indicate time of tongue contact with the lick spout. Solid white line indicates the root mean squared (RMS) amplitude of the microphone recording. (b) Left, dextran-conjugated Alexa 594 labeling (green) at recording sites in DCN and VCN (arrowheads). DAPI, red. Right, higher magnification of dashed white box on left showing a labeled fusiform cell (arrowhead). (c) Example ventral cochlear nucleus (VCN) unit response during licking. Arrows and dotted lines indicate times of tongue contact with the lick spout. Traces represent the microphone recording (top), smoothed firing rate (middle), and the VCN unit recording (bottom; scale: 30 μV). (d) Top, average RMS amplitude of the licking sound during VCN unit recordings (scale bar: 1 a.u.). Bottom, average VCN lick-triggered firing rate (n = 21). Thin lines are s.e.m. (e) Example DCN unit response during licking. Scale bar and display same as in c. (f) Top, average RMS amplitude of the licking sound during DCN unit recordings. Bottom, average lick-triggered responses of all DCN units (n = 25), excluding those exhibiting complex-spikes. Compared to VCN units, DCN units exhibited smaller temporal modulations related to licking (peak-to-trough firing rate for VCN: 43.8 ± 26.9 Hz, n = 21; for DCN: 19.7 ± 19.9 Hz, n = 25, mean and S.D., P = 0.0005, Wilcoxon Rank Sum Test). Scale bar and display same as in d. (g) Z-scored lick responses (see Methods) were significantly smaller in DCN compared to VCN units (P = 0.00002, Wilcoxon Rank Sum Test). Median responses are indicated by solid lines.

To determine whether licking sounds evoke neural responses that could interfere with auditory processing and, if so, whether such responses are cancelled out in the DCN, we compared neural activity during licking in well-isolated single-units in the ventral cochlear nucleus (VCN) and DCN. Since VCN receives direct auditory nerve input but lacks cerebellum-like circuitry, we hypothesized that VCN units would respond to acoustic stimuli regardless of whether they are self- or externally-generated. Recording locations were judged based on characteristic reversals of tonotopy at the DCN/VCN border[16, 17] and verified by iontophoresis of a dextran-conjugated fluorescent dye (Fig. 1b, white arrowheads indicate recording sites, Supplementary Fig. 3 and Methods). Though unambiguous criteria for linking physiological response properties with morphological cell classes have not yet been established for the awake mouse DCN[18], several properties of the recorded units indicate that they likely correspond to fusiform cells, including their high spontaneous firing rates and purely excitatory responses to acoustic stimuli (Supplementary Fig. 4 and Methods)[19-23]. Units exhibiting complex spikes, putative cartwheel cell interneurons[24, 25] (Supplementary Fig. 1), were also encountered and analyzed separately. Results for complex-spiking units are reported in Figure 4.

Figure 4

Non-auditory responses related to licking in DCN complex-spiking units

(a) Sound-evoked field potentials (50 ms, broadband noise, averaged over 50 presentations) recorded in DCN of the same mouse before (left) and after (right) surgical deafening. Note complete absence of sound-evoked field potentials after deafening. (b) Example DCN complex-spiking unit recorded during licking in a surgically deafened mouse. Arrows and dotted lines indicate times of tongue contact with the lick spout. Top trace, microphone recording. Below, extracellular voltage from a DCN complex-spiking unit (scale: 30 μV). i, ii, Expanded traces from boxed regions showing complex spike (CS) (shaded rectangle) and simple spike (SS) waveforms. (c,d) Lick-triggered SS and CS firing rates for two complex-spiking units recorded in deafened mice. Thin lines are s.e.m. Gray traces show the average lick-triggered response of shuffled spike trains. Data in c are from same unit as example traces in b. Top trace (black) is the RMS amplitude of the licking sound (scale bar = 1 a.u.). (e) Summary of z-scored lick responses of 11 complex-spiking units recorded in 3 surgically deafened mice. 8 showed significant lick responses in their SS firing and 3 showed significant lick responses in their CS firing (α = 0.01, see Methods). Median responses are indicated by solid lines. (f) Overall CS firing rates increased slightly during periods of licking in deafened mice (n = 11, P = 0.04, Wilcoxon Signed Rank Test). (g) Lick-triggered SS and CS firing rates for a complex-spiking unit recorded in a hearing mouse. Same display as in c. (h) Mimic-triggered SS and CS firing rates for a complex-spiking unit recorded in a hearing mouse. Same unit as shown in g. (i) Summary of licking and mimic responses in complex-spiking units recorded in hearing mice. Average Z-score responses to licking (n = 23) were 8.2 ± 4.4 for SSs and 3.8 ± 4.4 for CSs. Average Z-score responses to the mimic (n = 12) were 10.1 ± 4.9 for SSs and 3.2 ± 3.3 for CSs.

Consistent with the possibility that licking behavior causes significant self-generated sounds, VCN units exhibited an overall firing rate elevation during licking as well as firing rate modulations (Fig. 1c,d, blue traces) that tracked the RMS amplitude of the licking sound (black traces). In contrast, DCN units exhibited substantially weaker firing rate modulations during licking (Fig. 1e–g, red traces and circles). Though these results are consistent with cancellation of self-generated sounds in DCN, an alternative explanation is that differences between VCN and DCN responses during licking are due to systematic differences in their auditory response properties. We evaluated this possibility in a subset of VCN and DCN units by comparing activity during licking to activity during delivery of an externally-generated acoustic stimulus with temporal and spectral properties that roughly matched the licking sounds recorded across mice (Fig. 2a, Methods). This stimulus is referred to henceforth as the lick mimic and was presented outside of licking bouts, when the mouse was still. Though the match between actual sounds generated by licking and the lick mimic is not expected to be perfect, for example due to issues such as bone conduction, this stimulus nevertheless provided a simple and principled means of comparing auditory responses in VCN and DCN. Strong responses to the lick mimic were observed in both VCN (Fig. 2b,c, blue traces) and DCN units (Fig. 2d,e, red traces). The strength of responses during licking was highly correlated with the strength of responses to the lick mimic in VCN units (Fig. 2f, blue circles). This is exactly what is expected if VCN licking responses are indeed due to self-generated sounds. In contrast, there was no significant correlation between mimic and licking responses in DCN units (Fig. 2f, red circles), such that even units with strong responses to the lick mimic failed to respond during licking. 
These observations suggest that weaker responses to licking in DCN compared to VCN cannot be explained by differences in auditory sensitivity between the two regions. What then is the mechanism underlying the apparent reduction of responses to self-generated sounds during licking behavior in DCN?
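The lick-triggered firing rates and z-scored lick responses compared above could be computed along the following lines. The exact z-scoring procedure is given in the Methods, which are not reproduced here, so this sketch makes its own assumptions: the response is summarized as the peak-to-trough modulation of the lick-triggered firing rate, and the null distribution is built by re-triggering the same spike train on random times. All function names, bin widths, and windows are hypothetical.

```python
import bisect
import random
import statistics

def lick_triggered_rate(spike_times, trigger_times, window=(-0.1, 0.15), bin_width=0.01):
    """Average firing rate (Hz) in time bins around each trigger (all times in seconds)."""
    spikes = sorted(spike_times)
    n_bins = round((window[1] - window[0]) / bin_width)
    counts = [0] * n_bins
    for t0 in trigger_times:
        lo = bisect.bisect_left(spikes, t0 + window[0])
        hi = bisect.bisect_left(spikes, t0 + window[1])
        for s in spikes[lo:hi]:
            k = int((s - t0 - window[0]) / bin_width)
            if k < n_bins:  # guard against floating-point edge cases
                counts[k] += 1
    return [c / (len(trigger_times) * bin_width) for c in counts]

def modulation(rate):
    """Peak-to-trough firing rate modulation (Hz)."""
    return max(rate) - min(rate)

def lick_response_zscore(spike_times, lick_times, duration, n_shuffles=100, seed=0):
    """Z-score the observed modulation against a null from random trigger times.

    Assumption: this null is an illustrative stand-in for the shuffling
    procedure described in the Methods.
    """
    rng = random.Random(seed)
    observed = modulation(lick_triggered_rate(spike_times, lick_times))
    null = []
    for _ in range(n_shuffles):
        fake_licks = [rng.uniform(0.0, duration) for _ in lick_times]
        null.append(modulation(lick_triggered_rate(spike_times, fake_licks)))
    return (observed - statistics.mean(null)) / statistics.stdev(null)
```

Under these assumptions, a unit whose spikes are time-locked to licks yields a large positive z-score, whereas a unit firing independently of licking yields a value near zero.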

Figure 2

Responses to self-generated versus external sounds in VCN and DCN

(a) Spectrogram of the lick mimic generated from microphone recordings from 5 mice (Methods). Overlaid white line represents the RMS amplitude. (b) Example VCN unit response to the lick mimic. Same unit as in Fig. 1c. Traces represent a schematic of the RMS of the mimic (top), smoothed firing rate (middle), and the VCN unit recording (bottom; scale bar: 30 μV). (c) Top, schematic of the RMS of the lick mimic. Bottom, average VCN unit response to the lick mimic (n = 6). Thin lines are s.e.m. The lick mimic was delivered at 12 dB SPL in all experiments. (d, e) Same scale bar and display as b, c but for DCN unit responses to the mimic (n = 13). Traces in d are from same unit shown in Fig. 1e. VCN and DCN unit responses to the mimic were not significantly different (P = 0.32, Wilcoxon Rank Sum Test). (f) Responses to licking were highly correlated with those observed in the same units to the lick mimic for VCN (n = 6, P < 0.001, r = 0.95, linear regression t-test) but not DCN recordings (n = 13, P = 0.79, r = 0.0007, linear regression t-test).

One possibility is that the overall sensitivity of DCN units to sound is reduced during licking behavior. Indeed, an overall suppression of auditory responsiveness during behavior has been reported in a variety of systems[26, 27], including the mouse auditory cortex[28, 29]. To test this, we compared DCN unit responses to an externally-generated acoustic stimulus (bandpassed noise 5–15 kHz, 15 dB SPL) delivered either during licking (Fig. 3a, lick and noise) or when the mouse was still (Fig. 3a, noise alone). Responses to the acoustic stimulus were indistinguishable under the two conditions (Fig. 3a,b). In addition, overall firing rates in DCN units were similar when mice were licking versus still (Fig. 3c). Together, these results are inconsistent with an overall suppression of auditory sensitivity in DCN during licking and point instead to a mechanism for selectively canceling self-generated sounds.

Figure 3

DCN responses to acoustic stimuli are not suppressed during licking

(a) Example DCN unit response to an acoustic stimulus (bandpass-filtered noise, 5–15 kHz, 15 dB SPL) played while the mouse was still versus during licking. Gray bar indicates stimulus duration. (b) No differences in responses were observed when the mouse was still versus licking (n = 9, P = 0.49, Wilcoxon Signed Rank Test). (c) Overall firing rates for DCN units were similar when the mouse was licking versus still.

Non-auditory signals related to licking revealed in DCN of deafened mice

In addition to auditory nerve input, DCN receives non-auditory, behavior-related signals conveyed by mossy fibers. Previous studies of cerebellum-like structures in fish have shown that behavior-related signals conveyed by mossy fibers serve to selectively cancel self-generated electrosensory input[14, 15]. Though electrophysiological correlates of non-auditory mossy fiber inputs to DCN have been characterized in anesthetized or decerebrate preparations, e.g. using electrical stimulation of somatosensory brain regions projecting to DCN[12, 13, 30], responses to non-auditory inputs have not yet been demonstrated in awake, behaving animals. To isolate non-auditory responses related to licking behavior we recorded from DCN in deafened mice (n = 3). Deafening (see Methods) was confirmed by a lack of observable behavioral responses to acoustic stimuli and by recording auditory-evoked field potentials in DCN before and after deafening (Fig. 4a). For recordings in deafened mice we focused exclusively on units that exhibited both isolated action potentials, known as simple spikes, and brief, high-frequency bursts of action potentials, known as complex spikes (Fig. 4b, green boxes). Such complex-spiking units correspond to a class of DCN interneuron known as cartwheel cells (CWCs) that share numerous similarities with Purkinje cells in the cerebellum (Supplementary Fig. 1)[24, 25]. CWCs lack auditory nerve input, receive massive input from parallel fibers, and inhibit fusiform cells. Our reasons for focusing on CWCs were twofold: (1) the complex spike is a distinctive electrophysiological signature of CWCs, which allowed us to be confident that we were recording in the DCN even in the absence of sound-evoked responses in the deafened mice and (2) CWCs provide a convenient readout of non-auditory inputs conveyed by granule cells. Granule cells themselves are too small to be reliably isolated using conventional extracellular recording techniques. 
In deafened mice, 9 of 11 complex-spiking units exhibited significant simple and/or complex spike firing rate modulations related to licking (Fig. 4c–e, green traces). The overall rate of complex spike firing also increased slightly during licking (Fig. 4f). These results indicate that DCN receives non-auditory information related to licking behavior. Granule cells provide the main excitatory input to CWCs. Hence the non-auditory, licking-related responses we observed in CWCs are likely due to signals conveyed by parallel fibers. We also recorded from complex-spiking units in hearing mice. Most complex-spiking units exhibited simple and complex spike firing rate modulations related both to licking (Fig. 4g) and to presentation of the mimic when the mouse was still (Fig. 4h). Prominent responses to both licking and to the mimic (Fig. 4i) are consistent with the notion that CWCs receive both non-auditory and auditory signals conveyed by granule cells. This is consistent with anatomical evidence for prominent non-auditory as well as auditory input to GCDs[6] and previous electrophysiological evidence for prominent auditory responses in complex-spiking units in awake mice[31].

A role for the spinal trigeminal nucleus in cancelling self-generated sounds

Based on previous microstimulation and anatomical tracing studies, the spinal trigeminal nucleus (Sp5) is expected to be the major source of mossy fiber input to DCN conveying somatosensory information related to licking behavior[12, 32, 33]. As expected from past studies in other mammals, injection of an anterograde viral tracer (AAV2-GFP) into mouse Sp5 resulted in labeled mossy fibers in the granule cell domains (GCDs) of DCN (n = 3; Fig. 5a, arrowheads) as well as in the cerebellum (data not shown). If non-auditory, licking-related inputs from Sp5 serve to cancel out responses to self-generated acoustic stimuli in DCN, transiently silencing such inputs should reveal prominent licking-related responses in DCN neurons. Indeed, micropressure injection of the action potential blocker lidocaine into Sp5 led to an increase in overall firing in putative DCN output cells during licking as well as an increased modulation of firing (Fig. 5b,d, red) that tracked the amplitude of the licking sound (Fig. 5b, black lines). No such changes were observed after saline injection (Fig. 5c,d, purple). Furthermore, increases in licking responses after lidocaine injection cannot be explained by differences in sensitivity to acoustic stimuli between lidocaine and saline groups (Fig. 5e), changes in licking rate after lidocaine injection (Fig. 5f), or changes in the amplitude of licking sounds after lidocaine injection (Fig. 5g).

Figure 5

A role for the spinal trigeminal nucleus in cancelling self-generated sounds in DCN

(a) Labeled mossy fibers were observed in DCN granule cell domains (GCD) after injection of an anterograde viral tracer (AAV2-GFP) into the ipsilateral Sp5. Scale bars: 200 μm. Right, higher magnification views of areas indicated by dotted rectangle. Scale bars: 100 μm. White arrowheads indicate labeled mossy fibers in GCDs. (b,c) Lick-triggered response of DCN cells before (left) and after (right) injection of lidocaine (b, n = 10) or saline (c, n = 8) into Sp5. Thin lines are s.e.m. Solid black lines show the RMS amplitude of the licking sound (scale bars: 1 a.u.). (d) Lidocaine injection resulted in a significant increase in z-scored lick responses in DCN units (P = 0.0098, Wilcoxon Signed Rank Test, red) while no significant increases in z-scored lick responses occurred after saline injection (P = 0.31, Wilcoxon Signed Rank Test, purple). (e) Auditory responses to the mimic were not significantly different in lidocaine and saline groups (P = 0.87, Wilcoxon Rank Sum Test). (f) Lick rate did not differ before and after injection of lidocaine (P = 0.77, Wilcoxon Signed Rank Test) or saline (P = 0.25, Wilcoxon Signed Rank Test). (g) Changes in z-score lick responses were not correlated with changes in the maximum RMS of the licking sound after lidocaine injection (red, P = 0.36, linear regression t-test). Changes in the maximum RMS of the licking sound did not differ between lidocaine and saline groups (P = 0.51, Wilcoxon Rank Sum Test).

Adaptive cancellation of sounds correlated with behavior in DCN neurons

Studies of cerebellum-like structures in fish have shown that cancellation of self-generated inputs is not fixed but reflects an adaptive filtering process in which anti-Hebbian synaptic plasticity reduces correlations between principal cell activity and behavior-related signals conveyed by granule cells[14, 15]. Similar anti-Hebbian plasticity rules have been described at granule cell synapses in DCN[7-9]. Adaptive filtering in DCN would explain both how diverse sources of mossy fiber input are sculpted into patterns of synaptic input that selectively cancel responses to self-generated sounds and how such patterns are updated if the auditory consequences of a given behavior change. To test whether DCN is capable of adaptive filtering we delivered an external sound (broadband or bandpassed noise 5–15 kHz) temporally correlated with licking (30 ms after tongue contact). The conditions were the same as those for the experiments shown in Figure 3, except that many more sound presentations were used. Recordings were made using both glass microelectrodes and multi-site silicon probe electrodes (Supplementary Fig. 5). Use of the latter aided the maintenance of single-unit isolation through long bouts of licking. Responses of putative DCN output cells to the correlated sound declined over the course of several minutes of pairing (>1000 paired lick-sound presentations) (Fig. 6a–d,h, red lines). Such declines were not due to overall changes in firing rate, but rather were specific to the period of the noise-evoked response (Fig. 6a–d,h, black lines). Decreases in DCN responses to sounds correlated with licking are unlikely to reflect adaptation of peripheral auditory input as they were not observed in a separate group of DCN units in which an identical external sound was presented at the same rate but at random times relative to lick contact (Fig. 6e,f,h, yellow lines). 
Furthermore, no changes in sound-evoked responses were observed in VCN units when the external sound was temporally correlated with licking (Fig. 6g,h, blue lines). The magnitude of the reductions in response to acoustic stimuli correlated with licking varied substantially across DCN units (Fig. 6i). We found no relationships between the magnitude of such reductions and a number of behavioral and neural parameters, including licking rate, licking variability, baseline firing rate, and the initial magnitude of noise-evoked responses (Supplementary Fig. 6). More definitive criteria for identifying DCN cell types, such as juxtacellular labeling and antidromic stimulation, along with a thorough characterization of auditory response properties may, in future, provide insights into the source of this variation. Overall, these results are consistent with a plastic cancellation or adaptive filtering of self-generated stimuli in DCN similar to that described previously in cerebellum-like structures in fish.
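The logic of the pairing experiment can be illustrated with a toy adaptive filter. The sketch below is not a model fitted to the data and is not taken from the study; it assumes a single output unit, granule cell inputs that each fire in one time bin after tongue contact (a delta-function basis), a baseline of zero, and an anti-Hebbian rule in which each active granule cell weight is decreased in proportion to the output at that moment. The learning rate, bin counts, and function name are hypothetical.

```python
import random

def simulate_pairing(n_licks, correlated, n_bins=50, sound_bins=3,
                     fixed_onset=3, eta=0.01, seed=0):
    """Return the sound-evoked response of a toy output cell on each lick.

    Each lick cycle is split into `n_bins` time bins. Granule cell i is active
    only in bin i after tongue contact, so its weight w[i] is the lick-locked
    prediction that is added to the sound-driven input in that bin.
    """
    rng = random.Random(seed)
    w = [0.0] * n_bins  # granule cell weights, initially no prediction
    responses = []
    for _ in range(n_licks):
        # Sound at a fixed delay after contact, or at a random time in the cycle.
        onset = fixed_onset if correlated else rng.randrange(n_bins - sound_bins + 1)
        sound = [1.0 if onset <= t < onset + sound_bins else 0.0 for t in range(n_bins)]
        rate = [sound[t] + w[t] for t in range(n_bins)]
        responses.append(sum(rate[onset:onset + sound_bins]) / sound_bins)
        # Anti-Hebbian update: output that coincides with granule cell activity
        # reduces that granule cell's weight.
        for t in range(n_bins):
            w[t] -= eta * rate[t]
    return responses

paired = simulate_pairing(1000, correlated=True)
unpaired = simulate_pairing(1000, correlated=False)
```

In this toy model the response to the lick-correlated sound decays toward zero as the weights come to form a negative image of the sound-evoked input, whereas a sound delivered at random times relative to licking is reduced only slightly, because the lick-locked prediction can capture only its small average value in each bin.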

Figure 6

Adaptive cancellation of sounds correlated with behavior in DCN

(a–d) Top, response of an example DCN unit to an acoustic stimulus (broadband or bandpass-filtered noise, 5–15 kHz) presented correlated with lick onset. Lighter traces show responses to later licks, averaged in bins of 150 licks. Bottom, final response of this cell minus initial response to the sound plus lick. Thin lines are s.e.m. Left gray area shows the stimulus presentation period. Left dashed line shows the time of lick onset. Right dashed line and gray box show the mean and standard deviation, respectively, of the time of the next lick. (e, f) Same display for two DCN units for which the acoustic stimulus was played uncorrelated with the onset time of a lick. (g) Same display for an example VCN unit in which the acoustic stimulus was presented correlated with the onset time of a lick. (h) Group data showing average changes in noise-evoked responses over the course of repeated stimulus presentations. For DCN correlated units (red) the best fit decay rate was 0.0225 per 100 licks (n = 20, P = 6 × 10⁻¹⁵, linear regression t-test), for DCN uncorrelated units (yellow) the best fit decay rate was 0.001 but was not significantly different from 0 (n = 11, P = 0.42, linear regression t-test), and for VCN units (blue) the best fit decay rate was 0.001 but was not significantly different from 0 (n = 7, P = 0.56, linear regression t-test). Error bars are s.e.m. (i) Scatter plot of decay rates of best-fit exponentials fit separately for every unit. Horizontal black lines show the median value for each group. Open symbols correspond to the units used as examples in panels a–g.
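A decay rate of the kind reported in panels h and i could be estimated as sketched below. The fitting procedure actually used is described in the Methods, which are not reproduced here; this example assumes a simple least-squares line fitted to the logarithm of the normalized response as a function of lick number, and the function name and binning are hypothetical.

```python
import math

def fit_decay_rate(lick_numbers, responses):
    """Fit response = a * exp(-rate * lick_number) by least squares on log(response).

    Non-positive responses are skipped because their logarithm is undefined.
    Returns (rate, a), with rate expressed per lick.
    """
    xs, ys = [], []
    for x, r in zip(lick_numbers, responses):
        if r > 0:
            xs.append(x)
            ys.append(math.log(r))
    if len(xs) < 2:
        raise ValueError("need at least two positive responses")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope, math.exp(mean_y - slope * mean_x)

# Hypothetical example: responses averaged in bins of 150 licks, decaying at
# 0.0225 per 100 licks (0.000225 per lick).
bin_centers = [75 + 150 * k for k in range(8)]
synthetic = [math.exp(-0.000225 * x) for x in bin_centers]
rate_per_lick, amplitude = fit_decay_rate(bin_centers, synthetic)
rate_per_100_licks = 100 * rate_per_lick
```

Fitting in log space is a convenience for this sketch; a direct nonlinear fit of the exponential would weight the data points differently and may give somewhat different rates on noisy data.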

Discussion

Distinguishing between external and self-generated sensory stimuli is fundamental for perception, and is thought to involve a comparison between external sensory input and internal reference signals related to the animal’s own behavior, for example motor corollary discharge or proprioception[34]. The present study provides evidence that such a comparison takes place at the first central stage of mammalian auditory processing in the DCN. More specifically, our results suggest a scheme similar to that already well-established in cerebellum-like structures in fish, in which behavior-related signals conveyed by a mossy fiber-granule cell-parallel fiber system cancel out responses to self-generated sensory stimuli in principal neurons[14, 15]. Several independent lines of evidence from the present study support such a function for DCN. First, responses to sounds generated by licking are substantially weaker in DCN compared to VCN and such differences are not accounted for by weaker responses to external acoustic stimuli in DCN or by an overall suppression of DCN responses during licking. Second, non-auditory responses to licking behavior are observed in putative CWCs, presumably due to non-auditory signals conveyed by mossy fibers and granule cells. Third, inactivation of Sp5, a prominent source of somatosensory mossy fiber input to DCN, revealed responses to self-generated sounds in DCN units that resembled those observed in VCN units, suggesting that such input normally functions to cancel DCN responses to self-generated sounds. Finally, repeated pairing of acoustic stimuli with licking resulted in a gradual reduction of DCN responses to the paired stimulus. Importantly, such reductions were not observed when stimuli were presented at the same rate but uncorrelated with the time of lick contact. 
Cancellation of self-generated sounds at an early processing stage could provide mammals with a dedicated channel through which salient or unexpected auditory signals can rapidly guide motor output, such as escape or orienting behavior. This interpretation is consistent with effects of DCN lesions, which disrupt orienting towards but not discriminating between sound source locations[35, 36] and the fact that, in addition to projecting to the inferior colliculus, DCN projects directly to auditory thalamus[37], auditory cortex[38], and regions involved in the acoustic startle response[39]. To our knowledge, responses to self-generated sounds have not been studied at the level of the inferior colliculus. Based on the present results, we would predict that a subset of inferior colliculus neurons selectively encodes external sounds and that this subset receives its dominant input from DCN rather than VCN. Though we focused on a single behavior and a single source of mossy fiber input, the fact that DCN receives mossy fiber inputs from numerous brain regions conveying a wide range of sensory and motor signals implies a much broader capacity for canceling predictable auditory input[6]. We also note that our results by no means rule out the possibility that additional sources of mossy fiber inputs (besides those originating from Sp5) play a role in cancelling self-generated sounds caused by licking behavior. For example, mossy fiber input to DCN granule cell domains originating from the pontine nuclei could provide motor-related signals relevant for cancelling the auditory consequences of the animal’s own movements, including licking[40]. Though our results suggest that the integration of non-auditory and auditory inputs to DCN serves to cancel responses to self-generated stimuli, they do not rule out other functions for multimodal integration in DCN. 
Numerous lines of evidence suggest that the DCN plays an important role in processing spectral cues for sound localization[6, 10, 35]. A recent study provided evidence that the integration of auditory and vestibular information in DCN could aid in distinguishing changes in auditory input due to motion of an external sound source from those due to self-motion[13]. Specifically, Wigderson et al. demonstrate that vestibular and auditory inputs are combined nonlinearly in putative DCN output cells. This is a different mode of integration from that suggested here and by studies of other cerebellum-like structures in fish in which behavior-related signals conveyed by mossy fibers are used to subtract out self-generated signals. Since vestibular inputs would not have been engaged during the head-fixed licking behavior we studied, no direct comparison between the two studies is possible. However, determining whether different sources of mossy fiber inputs to DCN, e.g. vestibular versus somatosensory, perform similar or different computations is an important question for future studies. Key questions remain regarding the circuit mechanisms underlying the cancellation of self-generated sounds in DCN reported here. In cerebellum-like structures in fish, cancellation is due to the generation and subtraction of negative images of the responses of principal cells to self-generated inputs. Such negative images are formed by anti-Hebbian synaptic plasticity acting on corollary discharge, proprioceptive, and electrosensory signals conveyed by parallel fibers[14, 15]. Due both to limits on data collection imposed by satiation and to the technical difficulty of maintaining stable single-unit recordings in brainstem through long bouts of licking, we focused exclusively on providing evidence for cancellation. A crucial next step will be to determine whether cancellation of self-generated sounds in DCN is due to the generation of negative images. 
Furthermore, genetic tools available in mice should make it possible to perform a detailed dissection of the mechanisms underlying the cancellation of self-generated sounds in DCN. Key questions include the functional roles of specific cell types, such as the CWCs, and the roles of specific sites and mechanisms of plasticity, such as spike timing-dependent plasticity at parallel fiber synapses onto fusiform cells and CWCs described in vitro[7, 8, 41]. Finally, our results are intriguing from evolutionary and comparative perspectives. The brains of most vertebrates contain both a cerebellum and one or more sensory structures with circuitry closely resembling that of the cerebellum [2, 6, 15]. Though similarities between different cerebellum-like structures and the cerebellum are well-established in terms of their evolution, development, gene expression patterns, circuitry and synaptic plasticity, the question of whether they perform similar functions has been more difficult to address. Cerebellum-like structures associated with electrosensory processing in three distinct groups of fish have been shown to act as adaptive filters[14, 15] and numerous lines of evidence also exist supporting such a role for the mammalian cerebellum[42, 43]. In both cases granule cells convey a rich variety of signals[44-48] and a separate, non-plastic input (peripheral sensory input in the case of cerebellum-like structures and climbing fiber input in the case of cerebellum) instructs plasticity at granule cell synapses such that output that is predictable (in the case of cerebellum-like sensory structures) or associated with errors in motor performance (in the case of the cerebellum) is gradually reduced. Interestingly, adaptive cancellation of self-generated vestibular inputs has been demonstrated in neurons of the fastigial nucleus and vestibular nucleus in primates[49, 50]. 
Hence evidence provided here for sensory cancellation and adaptive filtering in DCN suggests that a core function may be shared by cerebellum-like structures and the cerebellum across vertebrate phylogeny.

Methods

All experimental protocols were approved by the Columbia University Institutional Animal Care and Use Committee. Adult male wild-type mice (129S6/SvEvTac) were used for all experiments. Mice were purchased from Taconic Biosciences (Hudson, NY) and housed in an on-site animal facility on a 12 hour light-dark cycle. Most experiments were performed during the light cycle. Data collection and analysis were not performed blind to the conditions of the experiments. All relevant data from this study are available from the authors upon reasonable request.

Surgery

Mice were anesthetized with isoflurane (1.5–2%) and placed in a stereotax equipped with zygomatic ear bars (Kopf Instruments). The skull was exposed and a small craniotomy 200–500 μm in diameter was made over the right dorsal cochlear nucleus (5.5 mm posterior to bregma and 2.3 mm lateral to the midline). The craniotomy was covered with silicone elastomer (Kwik-Sil, WPI, Sarasota, FL). A custom headplate was attached to the skull using dental cement (C&B Metabond, Parkell, Edgewood, NY). Mice were allowed to recover for 3 days prior to the start of experiments.

Experimental apparatus and auditory stimulus presentation

All mouse behavior and neurophysiology experiments were performed in a double-walled sound-attenuating chamber (Double Deluxe Model, Gretchken Industries). The ambient noise within the chamber was <30 dB SPL as measured by a sound pressure level meter (Bruel and Kjaer Type 2240). A custom head fixation device was used to secure the animal via two attachment points to a stainless steel headplate and allowed for consistent positioning across multiple recording sessions. The animal’s body was additionally secured between two pieces of styrofoam molded to its body. A stainless steel lick spout was positioned in front of the animal’s mouth and licks were detected using standard methods. Acoustic stimuli were generated using Spike2 software (Cambridge Electronic Design) and delivered using an electrostatic speaker (ES-1, Tucker-Davis Technologies) positioned approximately 10 cm in front of the mouse just to the right of the midline. Sound pressure levels of acoustic stimuli as measured in dB SPL were calibrated to the location of the animal’s right ear. The frequency response of the sound system was measured to be flat (±4 dB) from 1 kHz to 50 kHz using a ¼″ condenser microphone (377C01, PCB Piezotronics) attached to a preamplifier (426B03, PCB Piezotronics) positioned at the location of the mouse’s right ear. Sounds caused by licking were monitored by a small electret microphone (Knowles model 23329N) placed just above the lick spout. Microphone signals were sampled at 100 kHz and digitized using an analog to digital converter (Power 1401, Cambridge Electronic Design). The lick mimic was constructed from segments of microphone recordings 50 ms before tongue contact to 150 ms after tongue contact, and bandpass filtered between 1 and 50 kHz (n = 5 mice). We transformed each segment to a spectrogram using a short-time Fourier transform (Hamming window with a width of 10.24 ms and a stride of 5.12 ms).
We then constructed the mimic by performing principal component analysis on this set of lick-triggered spectrograms and making a weighted sum of the first five principal components. This resulted in a mimic spectrogram, which we used as a spectro-temporal filter to convolve with a random signal. This resulted in a stimulus (the lick mimic) which contained the most prominent spectro-temporal features of the licking sound (including distinct spectral peaks at 2, 8, and 30 kHz) with little power elsewhere. Due to issues such as bone conduction we could not measure the exact loudness of natural licking sounds. The lick mimic was replayed at a loudness that evoked a response in VCN units that was, on average, similar to that evoked by licking. This same loudness (12 dB SPL) was used subsequently for all experiments involving the mimic.

Behavioral training

Mice were allowed to recover for 3 days after surgery before beginning water deprivation and habituation to head restraint in the experimental apparatus. Weight was monitored daily and additional water was given in the home cage if the animal’s weight fell below 80% of its initial pre-surgical weight. Extracellular recordings from DCN and VCN units were then performed during daily sessions lasting 2–3 hours. Mice licked roughly 3,000 times per session.

Extracellular recording and identification of DCN and VCN neurons

Standard procedures were used for extracellular recording using glass microelectrodes (5–20 MΩ resistance). Pipettes with a long taper were used to avoid tissue damage. On the day of recording, mice were placed into the head restraint and the silicone elastomer was removed and 0.9% saline was placed over the exposed craniotomy. The microelectrode was lowered into the craniotomy vertically. As the electrode was advanced through the cerebellum a series of 200 ms long search tones from 5 kHz to 50 kHz (in 5 kHz steps) were delivered. Entrance into DCN was marked by a transient increase in electrode resistance along with the sudden appearance of tone-evoked multi-unit activity which occurred ~2700–3200 μm below the surface of the cerebellum. The microelectrode was then advanced in 1 μm steps until a unit was isolated. Complex-spiking units were the first units encountered on an electrode penetration through DCN and could be unambiguously identified based on their distinctive complex spikes. Complex spikes are stereotyped, high-frequency action potential bursts superimposed on a slower depolarization and are not observed in any DCN cell types except CWCs[24, 25]. Similar to previous in vivo extracellular recording studies of DCN in a variety of species, including mouse[18], we defined complex spikes as high-frequency bursts (ISIs < 3.5 ms) of 2–5 action potentials. Complex spikes were identified automatically in Spike2 using custom written scripts and then confirmed individually. Within such bursts, action potentials successively widened and decreased in amplitude (Fig. 4b). Complex-spiking units were isolated 50–200 μm from the surface of the DCN. DCN units lacking complex spikes, referred to here as simple spiking units, were isolated 100–300 μm from the surface of the DCN. Complex-spiking units were never found ventral to simple spiking units on the same electrode penetration consistent with the known cytoarchitecture of the DCN. 
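The burst criterion above (ISIs < 3.5 ms, 2–5 action potentials) can be illustrated with a minimal Python sketch. This is not the authors' Spike2 script. The function name `find_complex_spikes`, its parameters, and the decision to discard runs longer than five spikes are assumptions made for illustration, and the sketch omits the waveform criteria (successive widening and amplitude decrease) that the manual confirmation step would check.

```python
from typing import List, Tuple


def find_complex_spikes(
    spike_times_ms: List[float],
    max_isi_ms: float = 3.5,
    min_spikes: int = 2,
    max_spikes: int = 5,
) -> List[Tuple[float, int]]:
    """Return (onset time, spike count) for each candidate complex spike.

    A candidate is a run of consecutive spikes in which every inter-spike
    interval is shorter than max_isi_ms. Runs with fewer than min_spikes or
    more than max_spikes spikes are dropped (the latter is an assumption).
    """
    bursts: List[Tuple[float, int]] = []
    run: List[float] = []
    for t in sorted(spike_times_ms):
        if run and t - run[-1] < max_isi_ms:
            run.append(t)  # still inside the same high-frequency run
            continue
        if min_spikes <= len(run) <= max_spikes:
            bursts.append((run[0], len(run)))
        run = [t]  # start a new run at this spike
    if min_spikes <= len(run) <= max_spikes:
        bursts.append((run[0], len(run)))
    return bursts


# Hypothetical spike times (ms): one 3-spike burst, one 2-spike burst, two isolated spikes.
example_spikes_ms = [0.0, 2.0, 4.0, 50.0, 100.0, 103.0, 200.0]
example_bursts = find_complex_spikes(example_spikes_ms)
```

A unit would then be treated as complex-spiking if such bursts occur and pass visual confirmation; any threshold on how many bursts are required is not stated in the text.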
Passage from DCN into VCN was determined by monitoring the tone frequency that most strongly drove multi-unit activity for each 50 μm advance of the electrode. As the electrode advanced ventrally, the best frequency for driving multi-unit activity progressively decreased. A sudden increase in the best frequency (generally from ~5 kHz to ~20 kHz and usually occurring 500–600 μm below the surface of DCN) signified entrance into the VCN. Units which were isolated at least 100 μm ventral to the best frequency reversal (~800–1000 μm below the surface of the DCN) and which showed clear tone-evoked responses were classified as VCN units. Units isolated less than 100 μm from the best frequency reversal were not included in the analysis. Histological verification of DCN and VCN recording sites was performed by iontophoresis of dextran-conjugated Alexa Fluor 594 (D22913, Thermo Fisher Scientific) at recording sites between 100 and 300 μm below the surface of DCN (depths at which most DCN simple spiking units were isolated) and at 900 μm (the depth at which most VCN units were isolated). Only units that remained well-isolated through at least 75 licks were included in the analysis. Sounds associated with licking contain most power between 2 and 15 kHz, which corresponds to the lower portion of the mouse hearing range. For this reason we focused our recordings on regions of the cochlear nucleus that represent these frequencies. A subset of the recordings in Figure 6 (DCN correlated, n = 10/20; DCN uncorrelated, n = 5/11; VCN correlated, n = 3/7) were performed using a 16 channel silicon probe (Neuronexus, A1x16–5mm-25–177-A16). Silicon probe recordings proved superior to glass microelectrode recordings in terms of their stability during licking behavior. Probes consisted of a vertical linear array of 15 μm diameter electrode sites spaced 25 μm apart. Impedances ranged from ~2–6 kOhms.
Recording tracks were made in DCN or VCN until a well-isolated single unit emerged on at least one electrode site. Most sites exhibited only multi-unit activity and were not analyzed. The same electrophysiological signatures described above were used to identify the dorsal and ventral cochlear nuclei. Rank sum tests revealed no difference between probe and glass recordings in the median decay rate of cells in all three groups shown in Figure 6 (DCN correlated: P = 0.09, DCN uncorrelated: P = 0.79, VCN: P = 0.63).

Viral Injections

A nanoliter injector (504126, WPI instruments) was used to inject adeno-associated virus expressing green fluorescent protein. The pipette was positioned over the coordinates 7.2 mm posterior to bregma and 1.8 mm right of the midline and lowered until the tip touched the surface of the cerebellum. The pipette was then lowered 3.5 mm below the surface of the cerebellum to the base of the spinal trigeminal nucleus. 27 nL of the virus was injected in three 9 nL pulses. Virus was also injected at depths of 3.2, 2.9, and 2.7 mm below the surface of the cerebellum. The pipette was then slowly raised out of the cerebellum and the incision was closed using cyanoacrylate glue (Vetbond, 3M, Maplewood, MN). Two weeks after surgery, mice were anesthetized with ketamine/xylazine and perfused with 4% formaldehyde. The brains were dissected from the skull and allowed to post-fix in 4% formaldehyde overnight. They were then cryoprotected in a 30% sucrose solution and sectioned on a cryostat. Sections were then mounted on glass slides (Superfrost, Fisher Scientific, Waltham, MA), counterstained with DAPI, and imaged on a confocal microscope (Carl Zeiss Microscopy, Peabody, MA).

Deafening

Mice were deafened bilaterally. Surgery for deafening mice was performed using 2–4% isoflurane. An incision was made just posterior to the tragus and extended ventrally. The tympanum, malleus, and incus were visualized through the auditory meatus. Using fine forceps the tympanum was ruptured and the malleus and incus were removed. The stapes was removed exposing the oval window with care taken not to damage the stapedial artery. Using a 30 gauge needle, approximately 10–20 μL of 1.0 mg/mL kanamycin was injected through the oval window and into the cochlea. The middle ear was packed with gel foam and the mouse was allowed to recover in its home cage. Deafening was verified by lack of observable behavioral responses to acoustic stimuli and by recording sound evoked field potentials to broadband noise (50 ms, 6–90 dB SPL) in DCN ~75 μm below the first observed complex-spiking unit. This was done both before and 2 days after surgical deafening in each mouse. DCN recordings were performed 2–4 days after surgery. Recording locations within DCN were confirmed histologically using iontophoresis of dextran-conjugated Alexa 594 as described above.

Lidocaine injections into Sp5

A small craniotomy (~300 μm diameter) was made prior to attachment of the headplate at coordinates 7.2 mm posterior to bregma and 1.8 mm lateral to the midline and covered with silicone elastomer. On the day of the experiment, a glass micropipette with a long taper was pulled using a pipette puller (PC-10, Narishige Group) and manually broken to 3.5 μm diameter under a microscope. The pipette was then filled with 2% lidocaine in 0.9% saline with care taken to avoid air bubbles in the tip. The pipette was then coupled to a micropressure injector (Picospritzer MK III, Parker Instrumentation) and successful ejection of lidocaine was confirmed visually to ensure the tip was not clogged. The lidocaine pipette was advanced into Sp5 at an angle of 12.8 degrees. For Sp5 inactivation, DCN unit responses were recorded for ~200 licks before ~100 nL of lidocaine was injected in a single pulse. Location of the lidocaine pipette within Sp5 was verified histologically using iontophoresis of dextran-conjugated Alexa 594 as described above.

Lick-sound pairing

After isolation of a unit, access to water was given and contact with the lick spout by the animal’s tongue was paired with a 30 ms noise (15–71 dB SPL, broadband or bandpass filtered 5–15 kHz). In the correlated condition the noise was presented 30 ms after contact with the lick spout. The pairing was conducted continuously until the animal stopped licking or unit isolation was lost. In the uncorrelated condition presentation of the noise during licking was unrelated to the tongue’s contact with the spout and was instead presented at random intervals of 120–160 ms. Since these intervals are similar to inter-lick intervals, the overall rate of sound presentations was similar in the correlated and uncorrelated conditions. Correlated versus uncorrelated conditions were tested in the same mice on alternating sessions. The condition to be tested during a given session was determined prior to isolating a unit.
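The difference between the two pairing conditions can be made concrete with a short scheduling sketch in Python. It is illustrative only: the function names are hypothetical, and drawing the uncorrelated intervals from a uniform distribution over 120–160 ms is an assumption, since the text states only that the intervals were random within that range.

```python
import random
from typing import List


def correlated_onsets(lick_times_ms: List[float], delay_ms: float = 30.0) -> List[float]:
    """Noise onsets locked to tongue contact with a fixed 30 ms delay."""
    return [t + delay_ms for t in lick_times_ms]


def uncorrelated_onsets(
    start_ms: float,
    stop_ms: float,
    rng: random.Random,
    min_gap_ms: float = 120.0,
    max_gap_ms: float = 160.0,
) -> List[float]:
    """Noise onsets at random intervals, independent of lick times.

    Uniform sampling of the interval is an assumption of this sketch.
    """
    onsets: List[float] = []
    t = start_ms + rng.uniform(min_gap_ms, max_gap_ms)
    while t < stop_ms:
        onsets.append(t)
        t += rng.uniform(min_gap_ms, max_gap_ms)
    return onsets


# Hypothetical lick times (ms) and a 10 s uncorrelated schedule.
paired = correlated_onsets([0.0, 140.0, 275.5])
unpaired = uncorrelated_onsets(0.0, 10_000.0, random.Random(0))
gaps = [b - a for a, b in zip(unpaired, unpaired[1:])]
```

Because the mean interval (about 140 ms under the uniform assumption) is close to a typical inter-lick interval, both schedules deliver sounds at a similar overall rate while differing in their temporal relationship to licking.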

Data analysis and statistics

All analyses were performed using custom-written scripts for Matlab (Mathworks, Natick, MA) and Spike2. No statistical methods were used to predetermine sample sizes. Comparisons between two groups were made by Mann–Whitney U-test or Wilcoxon signed rank test for paired groups. Tests of the significance of linear regression slopes used a linear regression t-test. For the linear regression t-test residuals were assumed to be normally distributed but this was not formally tested. Differences were considered statistically significant at P < 0.01. Data are presented as mean ± s.e.m. unless indicated otherwise.

Lick sound spectrograms

To compute the average spectrogram of the sound associated with licking, we first bandpass filtered raw microphone traces, removing frequencies below 1 kHz and above 50 kHz (the highest frequency that could be detected by our equipment). 300 ms segments of the filtered microphone recording centered on the onset of each lick were transformed with a short-time Fourier transform (Hamming window with a width of 10.24 ms and a stride of 5.12 ms) to obtain a set of lick-centered spectrograms. These were averaged to obtain a lick-triggered average spectrogram. Time-frequency peaks were found by first applying a 2-D median filter (widths 290 Hz, 3 ms) to individual spectrograms and then convolving with a 2-D Gaussian kernel with widths 1.5 kHz and 20 ms. We then identified time-frequency peaks as the local maxima of the filtered spectrograms.
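At the 100 kHz sampling rate given above, the 10.24 ms window and 5.12 ms stride correspond to 1024 and 512 samples. The following NumPy sketch shows one way the lick-triggered average spectrogram could be computed under those parameters. It is not the authors' Matlab code: the function names are hypothetical, power (squared magnitude) is an assumed choice of spectrogram units, and the 1–50 kHz bandpass, median filter, and Gaussian smoothing steps are omitted.

```python
import numpy as np


def stft_power(segment, fs_hz=100_000, win_ms=10.24, hop_ms=5.12):
    """Power spectrogram of one segment using a Hamming window."""
    n_win = int(round(fs_hz * win_ms / 1000))  # 1024 samples at 100 kHz
    n_hop = int(round(fs_hz * hop_ms / 1000))  # 512 samples at 100 kHz
    window = np.hamming(n_win)
    n_frames = 1 + (len(segment) - n_win) // n_hop
    frames = np.stack(
        [segment[i * n_hop : i * n_hop + n_win] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_win, d=1.0 / fs_hz)
    return freqs, power.T  # power has shape (n_freqs, n_frames)


def lick_triggered_spectrogram(mic, lick_samples, fs_hz=100_000, half_ms=150):
    """Average the spectrograms of 300 ms segments centered on each lick onset."""
    half = int(fs_hz * half_ms / 1000)
    specs = []
    freqs = None
    for s in lick_samples:
        if s - half < 0 or s + half > len(mic):
            continue  # skip licks too close to the recording edges
        freqs, spec = stft_power(mic[s - half : s + half], fs_hz)
        specs.append(spec)
    return freqs, np.mean(specs, axis=0)


# Synthetic check: a pure 8 kHz tone stands in for the microphone trace.
fs = 100_000
mic = np.sin(2 * np.pi * 8000 * np.arange(fs) / fs)
freqs, avg_spec = lick_triggered_spectrogram(mic, [30_000, 60_000], fs)
peak_freq = freqs[np.argmax(avg_spec.mean(axis=1))]
```

With these settings the frequency resolution is 100 kHz / 1024, or about 98 Hz per bin.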

RMS amplitude of microphone traces

To compute the RMS amplitude of the sound associated with licking, microphone recordings were first bandpass filtered (1–50 kHz). We then computed the RMS amplitude of this filtered microphone trace by convolving the squared trace with a moving average kernel of width 1 ms and taking the square root of the result. These recordings were then aligned to the time of tongue contact with the lick spout and averaged across licks.
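The running RMS described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the bandpass step is omitted, and the moving average returns only fully overlapping windows (edge handling is not specified in the text).

```python
import math
from typing import List, Sequence


def moving_rms(trace: Sequence[float], fs_hz: float, window_ms: float = 1.0) -> List[float]:
    """Square the trace, average over a sliding window, then take the square root."""
    n = max(1, round(fs_hz * window_ms / 1000.0))  # 100 samples for 1 ms at 100 kHz
    csum = [0.0]
    for x in trace:
        csum.append(csum[-1] + x * x)  # cumulative sum of the squared trace
    return [
        math.sqrt(max(0.0, csum[i + n] - csum[i]) / n)
        for i in range(len(trace) - n + 1)
    ]


def lick_triggered_mean(
    signal: Sequence[float], lick_idx: Sequence[int], before: int, after: int
) -> List[float]:
    """Average fixed-length segments of signal aligned on each lick index."""
    segs = [
        signal[i - before : i + after]
        for i in lick_idx
        if i - before >= 0 and i + after <= len(signal)
    ]
    return [sum(col) / len(segs) for col in zip(*segs)]


# Synthetic check: a unit-amplitude 10 kHz tone has an RMS of 1/sqrt(2).
fs = 100_000
tone = [math.sin(2 * math.pi * 10_000 * i / fs) for i in range(1000)]
tone_rms = moving_rms(tone, fs)
aligned = lick_triggered_mean([float(v) for v in range(10)], [3, 6], before=1, after=2)
```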

Average and Z-scored electrophysiological responses during licking and mimic presentation

To compute average responses to licking or during delivery of the mimic, spike trains were convolved with a normalized sum-of-two-exponentials kernel, with a rise time of 5 ms and a decay time of 20 ms. Averages were aligned either on tongue contact with the lick spout or mimic delivery and average baseline firing was subtracted. Baseline firing rate was taken to be the average firing rate in periods at least 25 ms before the next lick or mimic onset and at least 150 ms after the previous lick or mimic onset. Peak-to-trough firing rates were computed by taking the average licking or mimic response in a 200 ms window centered on the tongue-to-spout contact or mimic onset and determining the difference between the maximum and minimum firing rates. To compute z-scores we first took the maximum of the average licking or mimic response in a 200 ms window centered on tongue-to-spout contact or mimic onset. We then created shuffled spike trains of approximately the same length as the original spike train by randomly sampling from the inter-spike-interval distribution of the real spike train. Each shuffled spike train was convolved with the same kernel as the real spike train, its lick- or mimic-triggered average computed, and the maximum firing rate of this triggered average taken in the same 200 ms window. This was repeated 500 times and the maximum of the triggered average of the real spike train was expressed in units of the standard deviation from the mean of the shuffle distribution, i.e. z-scored based on the shuffle distribution. We determined the significance of neural responses by computing approximate p-values for the recorded maximum lick-triggered rate, which were estimated by the fraction of shuffled spike trains showing maximum lick-triggered responses greater than that of the real spike train.
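The shuffle-based z-score can be sketched as follows in plain Python. Several details are assumptions made for illustration rather than taken from the text: the kernel is written as a difference of a decaying and a rising exponential on a 1 ms grid (the exact functional form is not given), ISIs are resampled with replacement, baseline subtraction is omitted, and all function names are hypothetical.

```python
import bisect
import math
import random
import statistics


def double_exp_kernel(rise_ms=5.0, decay_ms=20.0, length_ms=100):
    """Causal kernel on a 1 ms grid, normalized to unit sum (assumed form)."""
    k = [math.exp(-t / decay_ms) - math.exp(-t / rise_ms) for t in range(length_ms)]
    total = sum(k)
    return [v / total for v in k]


def peak_triggered_rate(spikes_ms, events_ms, kern, half_window_ms=100):
    """Peak of the event-triggered smoothed rate (Hz) in a window centered on events."""
    pad = len(kern) - 1  # extra history needed by the causal kernel
    n_bins = 2 * half_window_ms + pad
    hist = [0.0] * n_bins  # 1 ms peri-event spike counts summed over events
    for ev in events_ms:
        start = ev - half_window_ms - pad
        i = bisect.bisect_left(spikes_ms, start)  # spikes_ms must be sorted
        while i < len(spikes_ms) and spikes_ms[i] < ev + half_window_ms:
            hist[min(n_bins - 1, int(spikes_ms[i] - start))] += 1.0
            i += 1
    # Smoothing and averaging are linear, so smoothing the summed histogram
    # equals averaging the smoothed single-trial traces.
    peak = max(
        sum(kern[j] * hist[b - j] for j in range(len(kern)))
        for b in range(pad, n_bins)
    )
    return 1000.0 * peak / len(events_ms)


def shuffled_train(spikes_ms, rng):
    """Resample ISIs with replacement to build a train of similar duration."""
    isis = [b - a for a, b in zip(spikes_ms, spikes_ms[1:])]
    out = [spikes_ms[0]]
    while True:
        t = out[-1] + rng.choice(isis)
        if t > spikes_ms[-1]:
            return out
        out.append(t)


def shuffle_zscore(spikes_ms, events_ms, n_shuffles=500, seed=0):
    """Return (z, p) for the real peak relative to the shuffle distribution."""
    rng = random.Random(seed)
    kern = double_exp_kernel()
    real = peak_triggered_rate(spikes_ms, events_ms, kern)
    null = [
        peak_triggered_rate(shuffled_train(spikes_ms, rng), events_ms, kern)
        for _ in range(n_shuffles)
    ]
    z = (real - statistics.mean(null)) / statistics.stdev(null)
    p = sum(v > real for v in null) / n_shuffles
    return z, p


# Synthetic unit: one spike 10 ms after every event plus unlocked background spikes.
demo_rng = random.Random(1)
events = [1000.0 + 150.0 * i for i in range(100)]
spikes = [ev + 10.0 for ev in events]
t = 0.0
while t < 16000.0:
    t += demo_rng.expovariate(1.0 / 200.0)
    spikes.append(t)
spikes.sort()
z, p = shuffle_zscore(spikes, events, n_shuffles=100)
```

The demo uses 100 shuffles to keep the run short; the analysis described above used 500. Because the shuffled trains preserve the ISI distribution but not the timing relative to licks, the null distribution captures the peak expected from firing statistics alone.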

Correlated and uncorrelated sound-lick pairings

The noise-evoked response is defined in bins of 150 stimulus presentations. For each 150 presentations the response is defined as the maximum of the average noise-evoked response during that stimulus period minus the baseline rate during that period. For each unit the response is normalized to equal one in the first bin. We performed a linear regression between the stimulus bin and the log of the noise-evoked responses for each population, in order to extract a decay rate for each population.
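The decay-rate fit above amounts to a least-squares line through the log of the normalized responses. Here is a minimal Python sketch, assuming the per-bin responses (peak minus baseline, one value per 150 presentations) have already been computed for each unit. The function names are hypothetical, and skipping non-positive responses, for which the logarithm is undefined, is an assumption not stated in the text.

```python
import math
from typing import List, Sequence


def normalized_bins(responses: Sequence[float]) -> List[float]:
    """Scale a unit's per-bin responses so the first bin equals one."""
    return [r / responses[0] for r in responses]


def population_decay_rate(units: Sequence[Sequence[float]]) -> float:
    """Slope of log(normalized response) versus bin index, pooled over units."""
    xs: List[float] = []
    ys: List[float] = []
    for unit in units:
        for b, r in enumerate(normalized_bins(unit)):
            if r > 0:  # log is undefined otherwise; skipping is an assumption
                xs.append(float(b))
                ys.append(math.log(r))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Two hypothetical units whose responses shrink by exp(-0.2) per bin.
unit_a = [40.0 * math.exp(-0.2 * b) for b in range(6)]
unit_b = [12.0 * math.exp(-0.2 * b) for b in range(4)]
slope = population_decay_rate([unit_a, unit_b])
```

A negative slope indicates that responses shrink over successive bins, and a slope near zero indicates no systematic change.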

46 in total

1. Bidirectional synaptic plasticity in the cerebellum-like mammalian dorsal cochlear nucleus.

Authors: Kiyohiro Fujino; Donata Oertel
Journal: Proc Natl Acad Sci U S A Date: 2002-12-16 Impact factor: 11.205

2. Cell-specific, spike timing-dependent plasticities in the dorsal cochlear nucleus.

Authors: Thanos Tzounopoulos; Yuil Kim; Donata Oertel; Laurence O Trussell
Journal: Nat Neurosci Date: 2004-06-20 Impact factor: 24.884

3. Single-neuron recordings from unanesthetized mouse dorsal cochlear nucleus.

Authors: Wei-Li Diana Ma; Stephan D Brenowitz
Journal: J Neurophysiol Date: 2011-11-09 Impact factor: 2.714

4. Responses to tones and noise of single cells in dorsal cochlear nucleus of unanesthetized cats.

Authors: E D Young; W E Brownell
Journal: J Neurophysiol Date: 1976-03 Impact factor: 2.714

Review 5. Somatosensory influence on the cochlear nucleus and beyond.

Authors: Susan E Shore; Jianxun Zhou
Journal: Hear Res Date: 2006-03-02 Impact factor: 3.208

6. Physiology and morphology of complex spiking neurons in the guinea pig dorsal cochlear nucleus.

Authors: P B Manis; G A Spirou; D D Wright; S Paydar; D K Ryugo
Journal: J Comp Neurol Date: 1994-10-08 Impact factor: 3.215

7. Identification of response properties of ascending axons from dorsal cochlear nucleus.

Authors: E D Young
Journal: Brain Res Date: 1980-10-27 Impact factor: 3.252

8. Intracellularly labeled fusiform cells in dorsal cochlear nucleus of the gerbil. I. Physiological response properties.

Authors: Kenneth E Hancock; Herbert F Voigt
Journal: J Neurophysiol Date: 2002-05 Impact factor: 2.714

9. Sensory-motor interaction in the primate auditory cortex during self-initiated vocalizations.

Authors: Steven J Eliades; Xiaoqin Wang
Journal: J Neurophysiol Date: 2002-12-11 Impact factor: 2.714

10. Coactivation of pre- and postsynaptic signaling mechanisms determines cell-specific spike-timing-dependent plasticity.

Authors: Thanos Tzounopoulos; Maria E Rubio; John E Keen; Laurence O Trussell
Journal: Neuron Date: 2007-04-19 Impact factor: 17.173
