Emmanuel Biau, Danying Wang, Hyojin Park, Ole Jensen, Simon Hanslmayr.
Abstract
Audiovisual speech perception relies, among other things, on our ability to map a speaker's lip movements onto speech sounds. This multimodal matching is facilitated by salient syllable features that align lip movements and acoustic envelope signals in the 4-8 Hz theta band. Although non-exclusive, the predominance of theta rhythms in speech processing has been firmly established by studies showing that neural oscillations track the acoustic envelope in the primary auditory cortex. Similarly, theta oscillations in the visual cortex entrain to lip movements, and the auditory cortex is recruited during silent speech perception. These findings suggest that neuronal theta oscillations may play a functional role in organising information flow across visual and auditory sensory areas. We presented silent speech movies while participants performed a pure tone detection task to test whether entrainment to lip movements directs the auditory system and drives behavioural outcomes. We showed that auditory detection varied depending on the ongoing theta phase conveyed by lip movements in the movies. In a complementary experiment presenting the same movies while recording participants' electroencephalogram (EEG), we found that silent lip movements entrained neural oscillations in the visual and auditory cortices, with the visual phase leading the auditory phase. These results support the idea that the visual cortex, entrained by lip movements, filtered the sensitivity of the auditory cortex via theta phase synchronization.
Keywords: Auditory processing; Entrainment; Lip movements; Theta oscillations
Year: 2021 PMID: 36246505 PMCID: PMC9559921 DOI: 10.1016/j.crneur.2021.100014
Source DB: PubMed Journal: Curr Res Neurobiol ISSN: 2665-945X
Fig. 1 Experimental Paradigm of the Tone Detection Task (TDT). (A) In each trial, continuous white noise and a silent movie were presented together for 5 s. A first pure tone occurred randomly in the first half of the trial while a second tone occurred randomly in the second half of the trial. Participants were instructed to respond as quickly and accurately as possible whenever they detected a tone. In the one tone condition, the white noise track contained only one tone that occurred randomly in either half of the trial. In the zero tone condition, the sound of the trial contained only white noise (N.B. The face of the speaker has been blurred only in the figure for anonymity purposes). (B) Distribution of tone onsets along the visual phase in the first (top) and second (bottom) windows of the main condition of interest. The x-axis represents the visual phase binned in equidistant bins (from -π to +π; π/8 step). The y-axis depicts the proportion of tones occurring in each bin across participants (in proportion of total tones ± standard error of the mean). The red dashed lines represent the theoretical proportion of tone onsets expected in each bin for a random uniform distribution (n bins = 16; 6.25% per bin).
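The binning scheme in Fig. 1B (16 equidistant phase bins from -π to +π, with a uniform expectation of 6.25% per bin) can be sketched as follows. This is a minimal NumPy illustration, not the authors' code (the study used MATLAB); the function name is mine:

```python
import numpy as np

def bin_phase_proportions(phases, n_bins=16):
    """Proportion of events per equidistant phase bin from -pi to +pi (pi/8 step)."""
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    counts, _ = np.histogram(phases, bins=edges)
    return counts / counts.sum()

# Under a random uniform distribution, each of the 16 bins is expected
# to hold 1/16 = 6.25% of tone onsets (red dashed line in Fig. 1B).
rng = np.random.default_rng(0)
props = bin_phase_proportions(rng.uniform(-np.pi, np.pi, 10_000))
```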
Fig. 2Phasic modulation along the realigned phase in the two tones condition. Probability of correctly detected tones p(hits) = hits/(hits + misses) along the visual phase realigned on the preferred bin (0° bin, not plotted) in the two tones condition. The red line depicts p(hits) at the first tone and the green line depicts p(hits) at the second tone (mean ± standard deviation). The phase modulation was estimated by subtracting the average response of the two bins adjacent to the bin opposite to the preferred phase (white dots) from the average response of the two bins adjacent to the preferred phase (blue dots).
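The phase-modulation estimate described in the Fig. 2 caption (preferred bin realigned to 0°, then the two bins flanking the opposite phase subtracted from the two bins flanking the preferred phase) can be sketched in NumPy. This is an illustrative reconstruction under my reading of the caption, not the authors' MATLAB implementation:

```python
import numpy as np

def phase_modulation(p_hits, preferred_bin):
    """Phase-modulation estimate over binned hit probabilities.

    p_hits: hit probability p(hits) per phase bin (e.g. 16 bins, -pi to +pi).
    Realigns the preferred bin to index 0, then contrasts the two bins
    adjacent to the preferred phase against the two bins adjacent to the
    bin opposite the preferred phase (half the cycle away).
    """
    p_hits = np.asarray(p_hits, dtype=float)
    n = len(p_hits)
    realigned = np.roll(p_hits, -preferred_bin)   # preferred bin -> index 0
    near = (realigned[1] + realigned[-1]) / 2     # flanking the preferred phase
    opposite = n // 2                              # bin opposite the preferred phase
    far = (realigned[opposite - 1] + realigned[opposite + 1]) / 2
    return near - far
```

A flat p(hits) profile yields a modulation of zero; a cosine-shaped profile peaking at the preferred bin yields a positive value.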
Fig. 3 Visual entrainment and tone detection performance in the two tones condition. (A) Resultant vector length r from grand average phase at the onset of first and second tones across participants (hit trials: green line; miss trials: red line). The individual mean theta phases are depicted in polar coordinates (hit trials: green circles; miss trials: red circles). (B) Mean sensitivity index (d') and (C) reaction times of first and second tone hits. The graphs depict the density, the grand average (mean ± standard deviation; error bars indicate 5th and 95th percentiles), and individual means (grey dots) for first/second tones. Significant contrasts are indicated by stars (p < 0.05).
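The resultant vector length r used in Fig. 3A (and again in Fig. 4D) is a standard circular statistic: the magnitude of the mean of the unit vectors for each phase angle. A minimal NumPy sketch (the study computed this with the Circular Statistics Toolbox in MATLAB; the function name here is mine):

```python
import numpy as np

def resultant_vector_length(phases):
    """Length r of the mean resultant vector of phase angles (radians).

    r approaches 1 when phases are tightly concentrated (strong entrainment)
    and 0 when they are uniformly spread around the circle.
    """
    phases = np.asarray(phases, dtype=float)
    return np.abs(np.mean(np.exp(1j * phases)))
```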
Fig. 4 Theta phase coupling analysis between visual and auditory areas during lip movement perception. (A) Difference of mutual information between the late and early time-windows (MIlate > MIearly contrast; z values; coordinate of the slice: z = 0). Auditory (pink dot; MNI coordinates of maximum voxel: [-50 -21 0]; left middle temporal cortex) and visual (green dot; MNI coordinates of maximum voxel: [-40 -89 0]; left middle occipital cortex) sources were localized in the left hemisphere. (B) MIearly > MIlate contrast projected on the brain's surface for illustrative purposes: synchronization was estimated through the ϕA-V theta phase offset between theta oscillations at the identified auditory (pink line) and visual sources (green line) by means of phase coupling analysis. (C) Audio-visual phase coupling in the early and late time-windows corresponding to the time-windows containing the first and second tones in the TD Task. The mean ϕA-V offset between auditory and visual theta phases (red arrows) confirmed that oscillations entrained by lip movements in the visual cortex preceded oscillations in the auditory cortex by 75° (~37 ms) and 30° (~15 ms), respectively, in the early and late time-windows. (D) Theta synchronization between visual and auditory areas improves with entrainment. The resultant vector length r of the distance between the observed ϕA-V and the theoretical ϕA-V = 0 was greater in the late than the early window, suggesting a dynamic communication reflected by a decrease of time lag between visual and auditory activities. The graphs depict the density, the grand average (mean ± standard deviation; error bars indicate 5th and 95th percentiles), and individual resultant vector length r (grey dots). Significance indicated by a star (p < 0.05).
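The ϕA-V phase offsets in Fig. 4C are circular means of auditory-minus-visual phase differences, and the reported time lags follow from the oscillation frequency (75° ≈ 37 ms and 30° ≈ 15 ms both imply a theta cycle of roughly 180 ms, i.e. ~5.6 Hz). A minimal NumPy sketch under those assumptions (function names are mine, not the authors' pipeline):

```python
import numpy as np

def mean_phase_offset(phase_a, phase_v):
    """Circular mean of the phase difference between two phase series (radians)."""
    diffs = np.exp(1j * (np.asarray(phase_a) - np.asarray(phase_v)))
    return np.angle(np.mean(diffs))

def offset_to_lag_ms(offset_deg, freq_hz):
    """Convert a phase offset in degrees to a time lag in ms at freq_hz."""
    return offset_deg / 360.0 * 1000.0 / freq_hz

# At ~5.6 Hz theta, a 75-degree visual lead corresponds to about 37 ms:
print(round(offset_to_lag_ms(75, 5.6)))  # prints 37
```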
| Reagent or resource | Source | Identifier |
|---|---|---|
| Software and Algorithms | ||
| MATLAB | The MathWorks | R2018a |
| Psychophysics Toolbox | psychtoolbox.org | v.3 |
| FieldTrip | Donders Institute, Radboud University | v.20161231 |
| SPM8 | Wellcome Trust Centre for Neuroimaging | 8 |
| ASIO4All | Steinberg Media Technologies | 2.12 |
| ActiView | BioSemi B.V. Amsterdam, Netherlands | 7 |
| Shotcut | Meltytech, LLC | v.18.06.02 |
| Brainstorm Toolbox | neuroimage.usc.edu/brainstorm | |
| CARET | Washington University School of Medicine | 5.65 |
| Circular Statistics Toolbox | P. Berens, MATLAB Central File Exchange | v.1.21.0.0 |
| Other | ||
| BioSemi ActiveTwo system | BioSemi B.V. Amsterdam, Netherlands | EEG system |
| ER-3C system | Etymotic Research, Elk Grove Village, IL | EEG compatible earphones |
| Fastrak | Polhemus, Colchester, VT, USA | Electromagnetic digitiser |
| DET36A | Thorlabs | Photodetector |