Katerina Danae Kandylaki1,2,3, Arne Nagels4, Sarah Tune5, Tilo Kircher4, Richard Wiese2, Matthias Schlesewsky6, Ina Bornkessel-Schlesewsky6.
Abstract
The hierarchical organization of human cortical circuits integrates information across different timescales via temporal receptive windows, which increase in length from lower to higher levels of the cortical hierarchy (Hasson et al., 2015). A recent neurobiological model of higher-order language processing (Bornkessel-Schlesewsky et al., 2015) posits that temporal receptive windows in the dorsal auditory stream provide the basis for a hierarchically organized predictive coding architecture (Friston and Kiebel, 2009). In this stream, a nested set of internal models generates time-based ("when") predictions for upcoming input at different linguistic levels (sounds, words, sentences, discourse). Here, we used naturalistic stories to test the hypothesis that multi-sentence, discourse-level predictions are processed in the dorsal auditory stream, yielding attenuated BOLD responses for highly predicted versus less strongly predicted language input. The results were as hypothesized: discourse-related cues, such as passive voice, which effect a higher predictability of remention for a character at a later point within a story, led to attenuated BOLD responses for auditory input of high versus low predictability within the dorsal auditory stream, specifically in the inferior parietal lobule, middle frontal gyrus, and dorsal parts of the inferior frontal gyrus, among other areas. Additionally, we found effects of content-related ("what") predictions in ventral regions. These findings provide novel evidence that hierarchical predictive coding extends to discourse-level processing in natural language. Importantly, they ground language processing on a hierarchically organized predictive network, as a common underlying neurobiological basis shared with other brain functions.
SIGNIFICANCE STATEMENT: Language is the most powerful communicative medium available to humans. Nevertheless, we lack an understanding of the neurobiological basis of language processing in natural contexts: it is not clear how the human brain processes linguistic input within the rich contextual environments of our everyday language experience. This fMRI study provides the first demonstration that, in natural stories, predictions concerning the probability of remention of a protagonist at a later point are processed in the dorsal auditory stream. Results are congruent with a hierarchical predictive coding architecture assuming temporal receptive windows of increasing length from auditory to higher-order cortices. Accordingly, language processing in rich contextual settings can be explained via domain-general, neurobiological mechanisms of information processing in the human brain.
Keywords: auditory; dorsal stream; fMRI; language; predictive coding; temporal receptive windows
Year: 2016 PMID: 27903727 PMCID: PMC5148219 DOI: 10.1523/JNEUROSCI.4100-15.2016
Source DB: PubMed Journal: J Neurosci ISSN: 0270-6474 Impact factor: 6.167
Figure 1. Predictions are propagated through the different levels from sentence through word to phoneme sequence processing. At the same time, prediction errors are propagated through the hierarchy in the opposite direction (Bornkessel-Schlesewsky et al., 2015). Reprinted with permission from Bornkessel-Schlesewsky et al. (2015).
The conditions refer to the discourse in which the referent was introduced
| | High causality | Low causality |
|---|---|---|
| Active voice | Active-high (AH) | Active-low (AL) |
| Passive voice | Passive-high (PH) | Passive-low (PL) |
BOLD response was measured on the remention of the referent in Sentence C.
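The 2 × 2 design in the table above (voice × causality) can be enumerated programmatically. This is a minimal sketch for illustration only: the variable and function names are hypothetical and are not taken from the study's analysis code.

```python
from itertools import product

# Factor levels as given in the design table.
VOICES = ("active", "passive")
CAUSALITY = ("high", "low")

def condition_label(voice, causality):
    """Build the two-letter label used in the tables, e.g. 'AH' for active-high."""
    return (voice[0] + causality[0]).upper()

# Map each condition label to its (voice, causality) factor levels.
conditions = {
    condition_label(voice, causality): (voice, causality)
    for voice, causality in product(VOICES, CAUSALITY)
}

for label, (voice, causality) in conditions.items():
    print(f"{label}: {voice} voice, {causality} causality")
```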
Mean (SD) of ratings for naturalness, probability, plausibility, and comprehensibility as well as for event causality quantified in the Dowty (1991) proto-role properties
| Condition | NAT | PRO | PLA | COM | VOL | SEN | CAU | MOA | MOU | CHA | CHU |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AH | 3.2 (0.8) | 3.0 (0.9) | 2.6 (0.8) | 3.6 (0.6) | 3.6 (0.8) | 3.6 (0.7) | 3.3 (0.9) | 3.6 (0.8) | 1.8 (0.9) | 3.2 (0.9) | 3.3 (0.8) |
| AL | 3.2 (0.8) | 2.9 (0.9) | 2.5 (0.8) | 3.6 (0.7) | 3.0 (1.0) | 3.0 (0.9) | 2.7 (1.0) | 2.8 (1.0) | 2.2 (1.0) | 2.7 (1.0) | 2.7 (1.0) |
| PH | 3.2 (0.8) | 2.9 (0.9) | 2.5 (0.8) | 3.6 (0.6) | 3.5 (0.9) | 3.6 (0.7) | 3.2 (0.9) | 3.6 (0.8) | 1.8 (0.9) | 3.3 (0.8) | 3.3 (0.8) |
| PL | 3.1 (0.8) | 2.9 (0.9) | 2.5 (0.8) | 3.6 (0.6) | 3.2 (1.0) | 3.1 (0.9) | 2.9 (1.0) | 2.9 (1.0) | 2.2 (1.0) | 2.9 (1.0) | 2.9 (1.0) |
NAT, Naturalness; PRO, probability; PLA, plausibility; COM, comprehensibility; VOL, volition; SEN, sentience; CAU, causation; MOA, motion agent; MOU, motion undergoer; CHA, change agent; CHU, change undergoer; AH, active-high; AL, active-low; PH, passive-high; PL, passive-low.
p values of the model comparisons for the questionnaire ratings
| Comparison | NAT | PRO | PLA | COM | VOL | SEN | CAU | MOA | MOU | CHA | CHU |
|---|---|---|---|---|---|---|---|---|---|---|---|
| N versus C | 0.198 | 0.669 | 0.865 | 0.171 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
| N versus V | 0.521 | 0.238 | 0.722 | 0.917 | 0.770 | 0.931 | 0.662 | 0.096 | 0.294 | 0.025 | 0.020 |
N versus C is the comparison of the null model (only random factors) to the model with main effect of causality. N versus V is the comparison of the null model (only random factors) to the model with main effect of voice. NAT, Naturalness; PRO, probability; PLA, plausibility; COM, comprehensibility; VOL, volition; SEN, sentience; CAU, causation; MOA, motion agent; MOU, motion undergoer; CHA, change agent; CHU, change undergoer.
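The comparisons above set a null model against a model with one added fixed effect. The text here does not name the test behind the p values. One common choice for nested models is a likelihood-ratio test, sketched below under that assumption. The function name and the log-likelihood values are hypothetical and are not results from the study.

```python
import math

def likelihood_ratio_test(loglik_null, loglik_full):
    """Likelihood-ratio test for two nested models that differ by ONE parameter.

    Assumption: df = 1 (for example, a single two-level main effect added
    to a null model that contains only random factors).
    """
    # Test statistic: twice the gain in log-likelihood, floored at zero.
    statistic = max(2.0 * (loglik_full - loglik_null), 0.0)
    # Chi-square survival function for df = 1: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test(loglik_null=-100.0, loglik_full=-98.0)
print(f"chi2(1) = {stat:.2f}, p = {p:.4f}")
```

A small p value would indicate that adding the factor improves the fit over the null model, which is how the values below 0.001 in the N versus C row would be read under this assumption.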
Mean (SD) of duration, average intensity, and average frequency for the context sentence and for the referent
| Measure | Context sentence: AH | Context sentence: AL | Context sentence: PH | Context sentence: PL | Referent: AH | Referent: AL | Referent: PH | Referent: PL |
|---|---|---|---|---|---|---|---|---|
| Duration (in seconds) | 2.854 (0.820) | 2.559 (0.710) | 3.232 (0.864) | 3.053 (0.737) | 0.610 (0.131) | 0.615 (0.163) | 0.589 (0.144) | 0.609 (0.163) |
| Average intensity (in dB) | 48 (2) | 48 (2) | 48 (2) | 48 (2) | 51 (4) | 51 (4) | 51 (4) | 52 (4) |
| Average frequency (in Hz) | 185 (9) | 185 (8) | 185 (9) | 183 (9) | 197 (16) | 197 (18) | 195 (17) | 205 (18) |
AH, Active-high; AL, active-low; PH, passive-high; PL, passive-low.
Inferential statistics for the acoustic analysis (p values of Kruskal–Wallis tests)
| Criterion | Context sentence: by voice | Context sentence: by causality | Referent: by voice | Referent: by causality |
|---|---|---|---|---|
| Duration | 0.0006236 | 0.06314 | 0.6818 | 0.515 |
| Average intensity | 0.5898 | 0.8618 | 0.4779 | 0.3505 |
| Average frequency | 0.5572 | 0.6872 | 0.3262 | 0.07777 |
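For readers unfamiliar with the Kruskal–Wallis test reported above, the following minimal sketch computes the H statistic and its p value for two groups (for example, active versus passive). It assumes two groups, so that the chi-square approximation has one degree of freedom. The function names and the duration values are hypothetical, not the study's data.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_two_groups(a, b):
    """Kruskal-Wallis H (tie-corrected) and p value for two groups (df = 1).

    Assumes the pooled values are not all identical.
    """
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sum_a = sum(ranks[:len(a)])
    rank_sum_b = sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (rank_sum_a**2 / len(a) + rank_sum_b**2 / len(b)) - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tie groups.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(c**3 - c for c in counts.values()) / (n**3 - n)
    h = max(h / correction, 0.0)
    # Chi-square survival function for df = 1.
    p_value = math.erfc(math.sqrt(h / 2))
    return h, p_value

# Hypothetical sentence durations in seconds, for illustration only.
active = [2.4, 2.7, 2.9, 2.5, 3.0]
passive = [3.1, 3.3, 2.8, 3.4, 3.2]
h, p = kruskal_wallis_two_groups(active, passive)
print(f"H = {h:.3f}, p = {p:.4f}")
```

With more than two groups the same H statistic applies, but the p value would need a chi-square distribution with k − 1 degrees of freedom; a statistics library such as SciPy (`scipy.stats.kruskal`) handles that general case.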
Clusters with their local maxima coordinates and neuroanatomical details of their extent for the interaction (INT) of voice and causality and for the main effects (MEF) of voice and causality (p < 0.005, cluster threshold 72 voxels, Monte Carlo corrected)
| Contrast | Anatomical region | H | MNI coordinates | F | Cluster size in voxels |
|---|---|---|---|---|---|
| INT voice and causality | | L | −8, 14, 66 | 18.45 | 86 |
| MEF voice | | R | 50, −50, 44 | 17.10 | 405 |
| | | L | −6, 26, 36 | 16.13 | 158 |
| | | R | 8, −18, 36 | 15.86 | 447 |
| | | R | 42, 28, 42 | 15.70 | 405 |
| | | R | 30, 18, 56 | 15.65 | 210 |
| | | R | 26, 56, 24 | 14.42 | 276 |
| | | R | 8, 2, −10 | 14.28 | 73 |
| | | R | 32, 24, −12 | 14.07 | 351 |
| | | L | −20, −48, −20 | 13.47 | 128 |
| | | L | −48, 42, 8 | 13.42 | 192 |
| | | L | 8, 42, −4 | 13.35 | 223 |
| | | | −4, −58, −32 | 12.57 | 132 |
| | | L | −36, 26, 38 | 12.06 | 99 |
| MEF causality | | R | 30, 12, −12 | 18.69 | 239 |
| | | R | 50, −66, 8 | 18.12 | 225 |
| | | L | −34, −88, 30 | 17.24 | 81 |
| | | L | −56, −38, −6 | 14.81 | 148 |
| | | L | −42, −54, −14 | 12.84 | 112 |
Coordinates are listed in MNI atlas space. H, Hemisphere; BA, Brodmann area; PFm, subregion of the supramarginal gyrus (SMG) (for cytoarchitectonic features, see Caspers et al., 2006); PGa, rostral subregion of the angular gyrus (AG) (for cytoarchitectonic features, see Caspers et al., 2006); Fp1, subregion of the frontal pole (for cytoarchitectonic features, see Bludau et al., 2014); ORBinf, orbital part of the inferior frontal gyrus (IFG); IFGtriang, triangular part of the IFG (as abbreviated by Tzourio-Mazoyer et al., 2002).
Figure 2. Supra-threshold clusters (p < 0.005, cluster extent threshold 72 voxels) at the referent. Red represents the clusters for the main effect of voice. Yellow represents the clusters for the main effect of causality. Blue represents the cluster of the interaction. The slice numbers are included in the bottom left corner of the brain maps a–g. ACC, anterior cingulate cortex; AH, active high; AL, active low; CE, cerebellum; FFG, fusiform gyrus; IFG, inferior frontal gyrus; INS, insula; IPL, inferior parietal lobe; LH, left hemisphere; MCC, middle cingulate cortex; MEF, main effect of; MFG, middle frontal gyrus; MOG, middle occipital gyrus; MTG, middle temporal gyrus; PH, passive high; PL, passive low; RH, right hemisphere; SMA, supplementary motor area.
Figure 3. Supra-threshold clusters (p < 0.005, cluster extent threshold 72 voxels) at the context sentence (Sentence A). Red represents the clusters for the main effect of voice. Yellow (and its darker shades) represents the clusters for the main effect of causality. Green represents the cluster of the interaction. ACC, anterior cingulate cortex; AG, angular gyrus; AH, active high; AL, active low; IFG, inferior frontal gyrus; LH, left hemisphere; MEF, main effect of; MOG, middle occipital gyrus; MTG, middle temporal gyrus; PH, passive high; PL, passive low; PUT, putamen; RH, right hemisphere; SFG, superior frontal gyrus; SOG, superior occipital gyrus; SPL, superior parietal lobule; WM, white matter.