Abstract
Converging results suggest that perception is controlled by rhythmic processes in the brain. In the auditory domain, neuroimaging studies show that the perception of sounds is shaped by rhythmic activity prior to the stimulus, and electrophysiological recordings have linked delta and theta band activity to the functioning of individual neurons. These results have promoted theories of rhythmic modes of listening and generally suggest that the perceptually relevant encoding of acoustic information is structured by rhythmic processes along auditory pathways. A prediction from this perspective, which so far has not been tested, is that such rhythmic processes also shape how acoustic information is combined over time to judge extended soundscapes. The present study was designed to directly test this prediction. Human participants judged the overall change in perceived frequency content in temporally extended (1.2-1.8 s) soundscapes, while the perceptual use of the available sensory evidence was quantified using psychophysical reverse correlation. Model-based analysis of individual participants' perceptual weights revealed a rich temporal structure, including linear trends, a U-shaped profile tied to the overall stimulus duration, and, importantly, rhythmic components at the time scale of 1-2 Hz. The collective evidence found here across four versions of the experiment supports the notion that rhythmic processes operating on the delta time scale structure how perception samples temporally extended acoustic scenes.
Keywords: auditory perception; delta band; hearing; perceptual weights; reverse correlation; rhythmic perception; theta band
Year: 2019 PMID: 31396064 PMCID: PMC6663999 DOI: 10.3389/fnhum.2019.00249
Source DB: PubMed Journal: Front Hum Neurosci ISSN: 1662-5161 Impact factor: 3.169
Figure 1. Acoustic stimuli and analysis. (A) Stimuli were “soundscapes” consisting of 30 four-tone sequences either in- or de-creasing in frequency (example sequences are marked by black dots). The fraction of sequences moving in the same direction changed randomly across trials and between “epochs” of a specific duration, which varied between experiments (see Table 1). (B) Each trial was characterized by the level of motion evidence for the soundscape to in- or de-crease, with the evidence being independent between epochs (periods of constant evidence) and trials, and varying around a participant-specific threshold. The black line presents the evidence for the soundscape shown in panel (A), the gray lines the evidence for other trials, all with (on average) increasing frequency. An evidence of 1 corresponds to a fully coherent soundscape, an evidence of 0 to a completely random soundscape (15 tone sequences increasing, 15 decreasing). (C) The trial-averaged single participant perceptual weights (average sensory evidence for trials where participants responded with “up” or “down,” combined after correcting the sign of down responses) were analyzed using regression models. These models distinguished trivial temporal structure arising from linear trends or U/V shaped profiles locked to stimulus duration (blue) from rhythmic structure at faster time scales (red). The black graph displays the perceptual weight of one example participant together with the best-fitting trivial and rhythmic contributions. (D) Acoustic properties of these soundscapes, shown here for Experiment 3. Upper panel: frequency spectrum revealing an approximate 1/f structure. Middle and lower panels: temporal modulation spectra, derived as the frequency spectrum of band-limited envelopes at different frequencies (color-coded). The middle panel reveals a peak at 33 Hz, corresponding to the duration of individual tones.
The right panel shows the lack of specific modulation peaks at the behaviorally relevant range between 1 Hz and 5 Hz, as well as a lack of difference between soundscapes with in- and de-creasing frequency content. All spectra are averaged across all trials and participants (n = 20; Experiment 3).
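The perceptual weights described in panel (C) can be illustrated with a minimal sketch. It assumes that each trial's evidence is stored as one signed value per epoch and that responses are coded as +1 ("up") and -1 ("down"); the function name, the coding, and the toy observer are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def perceptual_weights(evidence, responses):
    """Sign-corrected mean evidence per epoch (reverse-correlation sketch).

    evidence  : (n_trials, n_epochs) signed evidence, positive = increasing
    responses : (n_trials,) with +1 for "up" and -1 for "down" (assumed coding)
    """
    evidence = np.asarray(evidence, dtype=float)
    responses = np.asarray(responses, dtype=float)
    # Flip the sign of "down" trials so that all trials point the same way,
    # then average over trials to obtain one weight per epoch.
    return (evidence * responses[:, None]).mean(axis=0)

# Toy data: a hypothetical observer who relies only on the first epoch.
rng = np.random.default_rng(0)
evidence = rng.normal(size=(2000, 10))
responses = np.where(evidence[:, 0] > 0, 1, -1)
weights = perceptual_weights(evidence, responses)
```

For this toy observer, only the first epoch should receive a clearly positive weight, while the remaining weights should stay close to zero.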
Table 1. Parameters and results for each experiment, including the duration of the soundscape and of the “epochs” over which the sensory evidence changed randomly, the number of epochs (sampling frequency) over which the perceptual weights were determined (Df), the number of participants (N), the best frequency determined by each model criterion (Fpeak), and the relative model criterion vs. the best trivial model.
| Experiment | Soundscape | Epoch | Df | N | Fpeak (cv-AICc) | Δcv-AICc | Fpeak (WAIC) | ΔWAIC |
|---|---|---|---|---|---|---|---|---|
| Experiment 1 | 1,800 ms | 180 ms (5.5 Hz) | 10 | 23 | 1.6 Hz | 183 | 1.6 Hz | 186 |
| Experiment 2 | 1,600 ms | 90 ms (11 Hz) | 17 | 23 | 1.3 Hz | 32 | 1.3 Hz | 71 |
| Experiment 3 | 1,200 ms | 120 ms (8.3 Hz) | 10 | 20 | 2 Hz | 121 | 2 Hz | 135 |
| Experiment 4 | 1,700 ms | 120 ms (8.3 Hz) | 14 | 20 | 1.2 Hz | 13 | 2 Hz | 46 |
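The relation between epoch duration and the behavioral sampling rate in the table above can be checked with a short sketch. The Df values match the number of whole epochs that fit into each soundscape. The Nyquist limit (half the sampling rate) is added here as an assumption about the highest frequency that can be resolved; the source does not state how its limits were set.

```python
# (soundscape_ms, epoch_ms) per experiment, as listed in the table above
experiments = {1: (1800, 180), 2: (1600, 90), 3: (1200, 120), 4: (1700, 120)}

def sampling_summary(soundscape_ms, epoch_ms):
    """Return (whole epochs, sampling rate in Hz, assumed Nyquist limit in Hz)."""
    n_epochs = soundscape_ms // epoch_ms   # whole epochs per soundscape
    rate_hz = 1000.0 / epoch_ms            # one weight per epoch
    nyquist_hz = rate_hz / 2.0             # assumption: standard Nyquist limit
    return n_epochs, rate_hz, nyquist_hz

summary = {exp: sampling_summary(*params) for exp, params in experiments.items()}
```

Under this assumption, the lowest limit is roughly 2.8 Hz (Experiment 1, 180 ms epochs), which lies above the best frequencies of 1.2 to 2 Hz reported in the table.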
Figure 2. Results. (A) Participant-averaged perceptual weights (black solid) with group-level two-sided 5% bootstrap confidence intervals (gray area) and the best-fitting model (green dashed). Units are in z-scores relative to a within-participant bootstrap baseline. (B) Participant-averaged frequency spectrum of the perceptual weights (black solid), with two-sided 5% bootstrap confidence interval (gray area). The dashed line represents the average spectrum obtained after time-shuffling the weights. (C) Model comparison results. The left curves show the group-level cv-AICc (blue; left y-axis) and Watanabe-Akaike information criterion (WAIC; red; right y-axis) values for the best trivial model (open circles) and frequency-dependent models. Individual participants’ best frequencies are denoted by solid black dots. The bars on the right show the exceedance probabilities of a comparison between the trivial model and the rhythmic model derived at the frequency yielding the lowest group-level cv-AICc. (D) Model parameters (betas) of the trivial contributions (offset, linear slope, U/V shaped profile) and the rhythmic component (R, root-mean-squared amplitude of sine and cosine components). Bars and error bars indicate the group-level mean and SEM. (E) Rhythmic component of the best-fitting model for each participant (lines) and phase of this component for each participant (inset). In panels (B,C) the gray transparent boxes black out those frequency ranges which cannot be faithfully reconstructed given the experiment-specific epoch duration (i.e., behavioral sampling rate); hence only the clear regions are meaningful.
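The model structure named in panel (D) can be sketched as an ordinary least-squares regression with trivial predictors (offset, linear slope, U/V-shaped profile) plus a sine and cosine pair at a candidate frequency. Several choices below are assumptions made for illustration: the quadratic form of the U/V term, the plain least-squares fit (the source uses cv-AICc and WAIC for model comparison), the reading of R as the root mean square of the two rhythmic betas, and all function names.

```python
import numpy as np

def design_matrix(t, freq_hz=None):
    """Columns: offset, slope, U/V profile and, optionally, sine and cosine."""
    t = np.asarray(t, dtype=float)
    tc = t - t.mean()                       # centre time on the stimulus midpoint
    cols = [np.ones_like(t), tc, tc ** 2]   # assumption: quadratic U/V term
    if freq_hz is not None:
        cols += [np.sin(2 * np.pi * freq_hz * t),
                 np.cos(2 * np.pi * freq_hz * t)]
    return np.column_stack(cols)

def fit_weights(t, w, freq_hz=None):
    """Least-squares betas and residual sum of squares for one participant."""
    w = np.asarray(w, dtype=float)
    X = design_matrix(t, freq_hz)
    beta, *_ = np.linalg.lstsq(X, w, rcond=None)
    rss = float(np.sum((w - X @ beta) ** 2))
    return beta, rss

def rhythmic_summary(beta):
    """RMS amplitude and phase of the sine/cosine pair (assumed definition of R)."""
    b_sin, b_cos = beta[3], beta[4]
    amplitude = np.sqrt((b_sin ** 2 + b_cos ** 2) / 2.0)
    # b_sin*sin(x) + b_cos*cos(x) = A*sin(x + phase)
    phase = np.arctan2(b_cos, b_sin)
    return amplitude, phase

# Synthetic weights: 10 epochs of 120 ms with a 2 Hz rhythmic component.
t = (np.arange(10) + 0.5) * 0.12
true_beta = np.array([0.5, 0.2, -0.1, 0.3, 0.4])
w = design_matrix(t, 2.0) @ true_beta

beta_rhythmic, rss_rhythmic = fit_weights(t, w, freq_hz=2.0)
beta_trivial, rss_trivial = fit_weights(t, w)
amplitude, phase = rhythmic_summary(beta_rhythmic)
```

On this noise-free synthetic example the rhythmic model recovers the generating betas, whereas the trivial model leaves residual variance. With real data, the comparison would need a criterion that penalizes the two extra parameters, as the information criteria in panel (C) do.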