Heidi Solberg Økland, Ana Todorović, Claudia S Lüttke, James M McQueen, Floris P de Lange.
Abstract
In language comprehension, a variety of contextual cues act in unison to render upcoming words more or less predictable. As a sentence unfolds, we use prior context (sentential constraints) to predict what the next words might be. Additionally, in a conversation, we can predict upcoming sounds through observing the mouth movements of a speaker (visual constraints). In electrophysiological studies, effects of visual constraints have typically been observed early in language processing, while effects of sentential constraints have typically been observed later. We hypothesized that the visual and the sentential constraints might feed into the same predictive process such that effects of sentential constraints might also be detectable early in language processing through modulations of the early effects of visual salience. We presented participants with audiovisual speech while recording their brain activity with magnetoencephalography. Participants saw videos of a person saying sentences where the last word was either sententially constrained or not, and began with a salient or non-salient mouth movement. We found that sentential constraints indeed exerted an early (N1) influence on language processing. Sentential modulations of the N1 visual predictability effect were visible in brain areas associated with semantic processing, and were differently expressed in the two hemispheres. In the left hemisphere, visual and sentential constraints jointly suppressed the auditory evoked field, while the right hemisphere was sensitive to visual constraints only in the absence of strong sentential constraints. These results suggest that sentential and visual constraints can jointly influence even very early stages of audiovisual speech comprehension.
Year: 2019 PMID: 31133646 PMCID: PMC6536519 DOI: 10.1038/s41598-019-44311-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Experimental task, sensor selection, N400 and latency results. (A) Trial sequence. A video displays a speaker uttering a Dutch sentence for about four seconds. The final word of the sentence appears after a short blank screen (320 ms). In this example, the auditory contents of the sentence-final target word are made more predictable due to the salient viseme (/f/), and the constraining sentential context. The target word here (‘fire’) is followed by a written word, presented on 20% of the trials. Participants pressed a button to indicate whether it was the same as the previous word. (B) N400-effect of sentence context. Left: topography of the difference in activity to words preceded by a sententially unconstrained vs. constrained context, with significant sensors highlighted. Sententially unconstrained words led to stronger neural activity over temporal, parietal and frontal sensors. Right: event-related fields to sententially unconstrained (purple) and constrained sentences (orange), averaged over sensors highlighted on the left. (C) Sensors of interest with average N1 topography to videos of single words. The most active left and right temporal sensors are highlighted. (D) Event-related fields to words beginning with salient (green line) and non-salient visemes (black line), plotted separately for sensors in the left and right hemisphere. Jackknifed auditory N1 peak latencies for each subject are represented by dots, their means by the vertical lines. In the left sensors, the auditory N1 peaked earlier if the viseme was salient than if it was not.
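The caption for panel (D) mentions jackknifed N1 peak latencies per subject. The abstract and captions do not spell out the procedure, so the following is only a minimal sketch of one common approach: peak latencies are measured on leave-one-out grand averages, and per-subject estimates are then recovered as n times the mean leave-one-out latency minus (n - 1) times each leave-one-out latency. The waveforms, the search window, and all function names here are hypothetical and are not taken from the study.

```python
import math


def peak_latency(waveform, times, window):
    """Return the time of the largest absolute deflection inside `window`."""
    lo, hi = window
    candidates = [(abs(v), t) for v, t in zip(waveform, times) if lo <= t <= hi]
    return max(candidates)[1]


def jackknife_latencies(subject_waveforms, times, window):
    """Leave-one-out peak latencies and recovered per-subject estimates."""
    n = len(subject_waveforms)
    n_samples = len(times)
    # Sum over subjects once, then subtract one subject at a time.
    total = [sum(w[k] for w in subject_waveforms) for k in range(n_samples)]
    loo = []
    for w in subject_waveforms:
        sub_avg = [(total[k] - w[k]) / (n - 1) for k in range(n_samples)]
        loo.append(peak_latency(sub_avg, times, window))
    mean_loo = sum(loo) / n
    # Recover individual estimates from the leave-one-out values.
    individual = [n * mean_loo - (n - 1) * j for j in loo]
    return loo, individual


# Synthetic data (assumption): four "subjects" with a negative-going
# deflection peaking at slightly different latencies, sampled every 5 ms.
times = [i * 0.005 for i in range(61)]
peaks = [0.09, 0.10, 0.11, 0.12]
waves = [[-math.exp(-((t - mu) / 0.02) ** 2) for t in times] for mu in peaks]

loo, individual = jackknife_latencies(waves, times, window=(0.05, 0.20))
print("leave-one-out latencies:", loo)
print("recovered per-subject latencies:", individual)
```

By construction, the mean of the recovered per-subject values equals the mean of the leave-one-out values, which is why a single vertical line can summarise either set in a plot like panel (D).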
Target words. IPA = International Phonetic Alphabet.
| Target words | English translation | IPA | Frequency | Acoustic word duration (ms) | Visual word duration (ms) | Total clip duration (ms) |
|---|---|---|---|---|---|---|
| fik | fire | [f | 6.49 | 293 | 440 | 1000 |
| film | film | [f | 174.28 | 372 | 720 | 1160 |
| filter | filter | [ˈf | 2.04 | 541 | 640 | 1040 |
| fit | in good shape | [f | 5.37 | 373 | 600 | 1040 |
| gif | poison | [x | 13.56 | 400 | 960 | 1320 |
| gil | scream/shriek | [xɪl] | 9.99 | 330 | 640 | 1040 |
| gisteren | yesterday | [ˈx | 131.79 | 513 | 640 | 1000 |
| gips | plaster/gypsum | [x | 2.36 | 492 | 680 | 1040 |
Frequency did not differ significantly between /f/- and /x/-words.
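The source does not say which test was used for this frequency comparison. As an illustration only, the sketch below runs an exact two-sided permutation test on the difference in mean frequency, using the eight frequency values from the target-word table. The choice of test and the function name `exact_permutation_p` are assumptions made for this example, not the authors' method.

```python
from itertools import combinations

# Frequencies copied from the target-word table.
f_words = {"fik": 6.49, "film": 174.28, "filter": 2.04, "fit": 5.37}
x_words = {"gif": 13.56, "gil": 9.99, "gisteren": 131.79, "gips": 2.36}


def mean(values):
    return sum(values) / len(values)


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    count = 0
    total = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding on ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            count += 1
    return count / total


p_value = exact_permutation_p(list(f_words.values()), list(x_words.values()))
print(f"mean /f/ = {mean(list(f_words.values())):.2f}, "
      f"mean /x/ = {mean(list(x_words.values())):.2f}, p = {p_value:.3f}")
```

With only four words per group, an exact enumeration of all 70 relabellings is cheap, and it avoids distributional assumptions that would be hard to justify for such skewed frequency values.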
The four experimental conditions with example sentences.
| | Salient viseme | Non-salient viseme |
|---|---|---|
| Sententially constrained | Het brandhout vloog meteen in de FIK | De tanden van een cobra bevatten dodelijk GIF |
| Sententially unconstrained | Toen Roel thuiskwam stonden zijn spullen in de FIK | De biologiestudenten lezen een artikel over GIF |
Figure 2. Main effects and interactions on ERF amplitudes. (A) Topography of the main effects of sentential constraints and viseme salience in the time window of the significant three-way interaction between hemisphere, viseme salience and sentence context. Sensors of interest are highlighted. (B) Left: topography of the interaction. Right: individual subject representation of the interaction, in the two most prominent sensors on each side. Each red and blue dot represents the difference in the signal between non-salient and salient visemes, under different conditions of sentential constraint. (C) ERFs for the salient and non-salient visemes in the two hemispheres under different conditions of sentential constraint. Clusters of significant differences are highlighted. Dots in the upper plots (left hemisphere) represent individual jackknife-estimated N1 latencies, with their means represented as vertical lines.