Tero Hakala, Annika Hultén, Minna Lehtonen, Krista Lagus, Riitta Salmelin.
Abstract
Neuroimaging studies of the reading process point to functionally distinct stages in word recognition. Yet current understanding of the operations linked to those stages is mainly descriptive. Approaches developed in computational linguistics may offer a more quantitative way to understand brain dynamics. Our aim was to evaluate whether a statistical model of morphology, with well-defined computational principles, can capture the neural dynamics of reading, using the concept of surprisal from information theory as the common measure. The Morfessor model, created for unsupervised discovery of morphemes, is based on the minimum description length principle and attempts to find optimal units of representation for complex words. In a word recognition task, we correlated brain responses with word surprisal values derived from Morfessor and from other psycholinguistic variables that have been linked with various levels of linguistic abstraction. The magnetoencephalography (MEG) data analysis focused on spatially, temporally and functionally distinct components of cortical activation observed in reading tasks. The early occipital and occipito-temporal responses were correlated with parameters relating to visual complexity and orthographic properties, whereas the later bilateral superior temporal activation was correlated with whole-word based and morphological models. The results show that the word processing costs estimated by the statistical Morfessor model are relevant for brain dynamics of reading during late processing stages.
Keywords: MEG; Morfessor; N400m; computational linguistics; computational modeling; language; morphology; orthography; surprisal
Year: 2018; PMID: 29524274; PMCID: PMC5969226; DOI: 10.1002/hbm.24025
Source DB: PubMed; Journal: Hum Brain Mapp; ISSN: 1065-9471; Impact factor: 5.038
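The abstract uses surprisal from information theory as the common measure, and the frequency predictors in the tables are reported as negative log values. As a minimal sketch of that idea only (not the Morfessor model, and not the authors' pipeline), the snippet below computes surprisal as the negative logarithm of a relative frequency. The base-2 logarithm, the function names, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def surprisal(probability: float) -> float:
    """Return surprisal in bits, -log2(p). The base 2 is an assumption."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


def frequency_surprisal(tokens):
    """Map each word form to the surprisal of its relative corpus frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: surprisal(count / total) for word, count in counts.items()}


# Hypothetical toy corpus: one frequent form and one rare inflected form.
toy_corpus = ["talo", "talo", "talo", "talossa"]
values = frequency_surprisal(toy_corpus)
for word, bits in sorted(values.items()):
    print(f"{word}: {bits:.3f} bits")
```

Rarer forms receive higher surprisal, which is the sense in which a negative log frequency can be read as a processing cost.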
Figure 1. Experimental stimuli. Examples of the four functionally distinct stimulus categories: words, pseudowords, symbol strings, and (pseudo)words embedded in Gaussian random noise. Each trial consisted of a fixation cross that appeared for 500 ms, followed by a single stimulus that was displayed for 1,500 ms.
Figure 2. Source modeling. (a) The final source model for each participant consisted of four temporally, spatially and functionally identified equivalent current dipoles (ECDs). The locations of the ECDs were identical across participants, but the orientation was determined individually. (b) The ECD amplitude time courses, averaged for each stimulus category and across participants, highlight the distinct functional roles of the different ECDs. The dashed vertical line marks the time at which the ECD was localized.
Correlations between predictor variables
| | Image complexity | Word length | ‐log bigram frequency | TPL | Morfessor | ‐log lemma frequency | ‐log surface frequency |
|---|---|---|---|---|---|---|---|
| Image complexity | 1 | 0.81** | −0.07 | −0.06 | 0.42** | 0.05 | 0.27** |
| Word length | | 1 | 0.09 | −0.05 | 0.54** | 0.06 | 0.34** |
| ‐log bigram frequency | | | 1 | 0.06 | 0.06 | −0.08 | 0.09 |
| TPL | | | | 1 | 0.01 | 0.61** | −0.10* |
| Morfessor | | | | | 1 | 0.38** | 0.74** |
| ‐log lemma frequency | | | | | | 1 | 0.40** |
| ‐log surface frequency | | | | | | | 1 |
*p < .05, **p < .001.
Correlation coefficients r of word measures to source component amplitudes and reaction times
| Predictor variable | Occipital | Occipito‐temporal | Left temporal | Right temporal | Reaction times |
|---|---|---|---|---|---|
| **Words** | | | | | |
| Image complexity | 0.21** | −0.11* | 0.01 | 0.12* | 0.42** |
| Length | 0.31** | −0.13* | 0.07 | 0.25** | 0.57** |
| ‐log bigram frequency | 0.003 | −0.11* | −0.11* | −0.09 | −0.20** |
| TPL | 0.02 | −0.01 | 0.02 | 0.04 | 0.06 |
| Morfessor | 0.21** | −0.10* | 0.32** | 0.29** | 0.61** |
| ‐log lemma frequency | 0.01 | −0.03 | 0.21** | 0.17** | 0.32** |
| ‐log surface frequency | 0.11* | −0.08 | 0.35** | 0.21** | 0.55** |
| **Pseudowords** | | | | | |
| Image complexity | 0.28** | −0.14* | 0.12* | 0.28** | 0.50** |
| Length | 0.39** | −0.16* | 0.15* | 0.33** | 0.64** |
| ‐log bigram frequency | −0.1 | −0.04 | 0.05 | −0.004 | 0.04 |
| Morfessor | 0.28** | −0.14* | 0.17* | 0.30** | 0.34** |
*p < .05, **p < .001.
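The coefficients in the table above are correlation coefficients r between a word measure and either a source component amplitude or a reaction time. As a rough sketch of how one such cell could be computed, the snippet below implements the Pearson product-moment correlation in plain Python. Treating r as Pearson's coefficient is an assumption, and the function name and all numbers are hypothetical, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-word values: a model cost and a source amplitude.
surprisal_values = [1.0, 2.0, 3.0, 4.0, 5.0]
amplitudes = [2.1, 2.9, 4.2, 4.8, 6.0]
r = pearson_r(surprisal_values, amplitudes)
print(f"r = {r:.2f}")
```

Each table cell would correspond to one such call, with the per-word predictor in one sequence and the per-word response measure in the other.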
Multiple regression β coefficients of predictors to source component amplitudes and reaction times. R² is the total variance explained by the complete model
| Predictor variables | Source component | Reaction times | |||
|---|---|---|---|---|---|
| Occipital | Occipito‐temporal | Left temporal | Right temporal | ||
|
| |||||
| Image complexity | −0.15 | −0.24 | |||
| Length | 0.30** | −0.15 | 0.33** | 0.36** | |
| ‐log bigram frequency | −0.14 | −0.11** | |||
| TPL | |||||
| Morfessor | 0.22 | 0.21** | 0.20 | ||
| ‐log lemma frequency | 0.14** | ||||
| ‐log surface frequency | 0.22 | 0.21** | |||
| Total | 0.09** | 0.04 | 0.15** | 0.12** | 0.52** |
|
| |||||
| Image complexity | |||||
| Length | 0.39** | −0.16 | 0.34** | 0.89** | |
| ‐log bigram frequency | |||||
| Morfessor | 0.17 | −0.34** | |||
| Total | 0.16** | 0.03 | 0.03** | 0.11** | 0.46** |
*p < .05, **p < .001.
Figure 3. Visualization of how item‐level cortical activations are related to linguistic models, with the highest correlations displayed for each studied response type. The time course of activation is averaged in bins of 60 words (lowest, average, highest values of the model). The scatter plots depict the relative source amplitudes (averaged over the time window marked with gray in the time course) for individual words with respect to the linguistic model.
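The binning described in the Figure 3 caption can be sketched as follows: items are ordered by their model value, and the response is averaged within a lowest, a middle, and a highest bin. This is only an illustration under stated assumptions. It averages one scalar amplitude per item rather than a full time course, it places the middle bin around the median rank (the caption does not specify how the "average" bin was chosen), and the function name and data are hypothetical.

```python
def bin_means(model_values, amplitudes, bin_size):
    """Mean amplitude in the lowest, middle and highest bins of model value."""
    if len(model_values) != len(amplitudes):
        raise ValueError("inputs must have equal length")
    if bin_size < 1 or len(model_values) < bin_size:
        raise ValueError("bin_size must be between 1 and the number of items")
    # Item indices ordered from lowest to highest model value.
    order = sorted(range(len(model_values)), key=lambda i: model_values[i])
    # Assumption: the middle bin is centred on the median rank.
    mid_start = (len(order) - bin_size) // 2
    bins = {
        "lowest": order[:bin_size],
        "middle": order[mid_start:mid_start + bin_size],
        "highest": order[-bin_size:],
    }
    return {
        name: sum(amplitudes[i] for i in indices) / bin_size
        for name, indices in bins.items()
    }


# Hypothetical data: nine items whose amplitude grows with the model value.
toy_model = [float(v) for v in range(9)]
toy_amplitudes = [2.0 * v for v in toy_model]
means = bin_means(toy_model, toy_amplitudes, bin_size=3)
print(means)
```

With the caption's bin size, the call would use `bin_size=60` on the full item set.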