Yohei Oseki, Alec Marantz.
Abstract
One of the central debates in the cognitive science of language has revolved around the nature of human linguistic competence. Whether syntactic competence should be characterized by abstract hierarchical structures or reduced to surface linear strings has been actively debated, but the nature of morphological competence has been insufficiently appreciated despite the parallel question in the cognitive science literature. In this paper, in order to investigate whether morphological competence should be characterized by abstract hierarchical structures, we conducted a crowdsourced acceptability judgment experiment on morphologically complex words and evaluated five computational models of morphological competence against human acceptability judgments: Character Markov Models (Character), Syllable Markov Models (Syllable), Morpheme Markov Models (Morpheme), Hidden Markov Models (HMM), and Probabilistic Context-Free Grammars (PCFG). Our psycholinguistic experimentation and computational modeling demonstrated that "morphous" computational models with morpheme units outperformed "amorphous" computational models without morpheme units and, importantly, PCFG with hierarchical structures most accurately explained human acceptability judgments on several evaluation metrics, especially for morphologically complex words with nested morphological structures. Those results strongly suggest that human morphological competence should be characterized by abstract hierarchical structures internally generated by the grammar, not reduced to surface linear strings externally attested in large corpora.
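To make the contrast between the "amorphous" and "morphous" model classes concrete, here is a minimal bigram Markov model sketch. It is illustrative only, not the authors' implementation: the toy corpora, add-alpha smoothing, and vocabulary sizes are all assumptions for the example.

```python
import math
from collections import Counter

def train_bigrams(corpus):
    """Count bigrams and context unigrams over a corpus of unit sequences."""
    bi, uni = Counter(), Counter()
    for seq in corpus:
        units = ["<s>"] + list(seq) + ["</s>"]
        uni.update(units[:-1])          # contexts only; </s> never conditions
        bi.update(zip(units, units[1:]))
    return bi, uni

def bigram_logprob(sequence, bigram_counts, unigram_counts, vocab_size, alpha=1.0):
    """Add-alpha smoothed bigram log-probability of a unit sequence
    (characters for an 'amorphous' model, morphemes for a 'morphous' one)."""
    units = ["<s>"] + list(sequence) + ["</s>"]
    logp = 0.0
    for prev, cur in zip(units, units[1:]):
        num = bigram_counts[(prev, cur)] + alpha
        den = unigram_counts[prev] + alpha * vocab_size
        logp += math.log(num / den)
    return logp

# Toy training data (hypothetical): the character model sees raw strings,
# the morpheme model sees the same words segmented into morpheme units.
char_corpus = ["washable", "readable", "unread"]
morph_corpus = [["wash", "able"], ["read", "able"], ["un", "read"]]

bi_c, uni_c = train_bigrams(char_corpus)
bi_m, uni_m = train_bigrams(morph_corpus)

# Score the same novel word under each unit inventory.
print(bigram_logprob("unwashable", bi_c, uni_c, vocab_size=30))
print(bigram_logprob(["un", "wash", "able"], bi_m, uni_m, vocab_size=10))
```

Scoring the same novel word over characters versus morphemes is, in miniature, the Character-versus-Morpheme comparison the paper reports; the HMM and PCFG further add latent categories and hierarchical structure on top of such sequence probabilities.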
Keywords: acceptability; computational modeling; grammaticality; morphology; probability; psycholinguistics
Year: 2020 PMID: 33281652 PMCID: PMC7688581 DOI: 10.3389/fpsyg.2020.513740
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
The stimuli were novel morphologically complex words: all unattested (zero surface frequency) and trimorphemic, with either linear or nested morphological structures. There were 300 linear words (with two suffixes, inner and outer) and 300 nested words (with an inner suffix and an outer prefix), hence 600 words in total.
Figure 1. Descriptive statistics of the acceptability judgment experiment. The x-axis represents individual acceptability judgments z-score transformed for each participant, while the y-axis shows probability densities. Descriptive statistics are separated into linear (blue) and nested (red) structures.
Effect accuracies of computational models.

| Model | Linear | Nested | t | p | Cohen's d | Effect accuracy |
|---|---|---|---|---|---|---|
| Human | 4.67 | 4.39 | 3.39 | <0.001*** | 0.28 | — |
| Character | −6.17 | −6.31 | 0.63 | ns | 0.05 | 0.23 |
| Syllable | −1.96 | −2.22 | 0.98 | ns | 0.08 | 0.20 |
| Morpheme | 2.15 | 1.47 | 9.08 | <0.001*** | 0.74 | 0.46 |
| HMM | −0.85 | −1.47 | 11.51 | <0.001*** | 0.94 | 0.66 |
| PCFG | 1.35 | 1.18 | 2.68 | <0.01** | 0.22 | **0.06** |

Mean acceptability judgments of linear and nested morphological structures, t-values, p-values, Cohen's d, and effect accuracies (i.e., absolute differences in Cohen's d from human acceptability judgments) are presented for each computational model; *p < 0.05, **p < 0.01, ***p < 0.001. Bold value represents best performance.
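The effect-accuracy metric can be reproduced directly from the Cohen's d values using the definition in the table note (absolute difference between a model's effect size and the human effect size). A minimal sketch, using the d values reported above; this is a reconstruction of the metric as defined, not the authors' code:

```python
# Effect accuracy, as defined in the table note: the absolute difference
# between each model's Cohen's d (linear vs. nested) and the human Cohen's d.
human_d = 0.28
model_d = {"Character": 0.05, "Syllable": 0.08, "Morpheme": 0.74,
           "HMM": 0.94, "PCFG": 0.22}

effect_accuracy = {name: abs(d - human_d) for name, d in model_d.items()}

# Smaller is better: the best model is the one whose effect size
# most closely matches the human effect size.
best = min(effect_accuracy, key=effect_accuracy.get)
print(effect_accuracy)
print(best)
```

Note that a model can have a large, significant effect (e.g., HMM, d = 0.94) yet a poor effect accuracy, because the metric rewards matching the human effect size rather than maximizing the model's own.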
Figure 2. Deviance accuracies of computational models. The x-axis represents computational models, while the y-axis shows deviance accuracies (i.e., decreases in deviance statistics from the baseline model). Colors indicate computational models: blue = Character Markov Model, orange = Syllable Markov Model, yellow = Morpheme Markov Model, green = Hidden Markov Model, brown = Probabilistic Context-Free Grammar. The horizontal dashed line is χ2 = 3.84, the critical χ2-statistic at p = 0.05 with df = 1.
Figure 3. Residual accuracies of computational models. The x-axis represents computational models, while the y-axis shows residual accuracies (i.e., decreases in absolute residual errors from the baseline model). Residual accuracies are categorized into linear (left) and nested (right) morphological structures. The horizontal dashed line is a “tie” borderline where computational models make the same predictions as the baseline model. Positive and negative residual accuracies mean better and worse predictions relative to the baseline model, respectively.
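Both figure metrics compare each model against a baseline regression model. A hypothetical sketch of how such accuracies could be computed is below; the function names and numeric values are illustrative assumptions, not the authors' code or data:

```python
# Hypothetical sketch. Deviance accuracy = baseline deviance - model deviance,
# a likelihood-ratio-style improvement that is significant when it exceeds the
# critical chi-square statistic (3.84 at p = 0.05 with df = 1).
# Residual accuracy = per-item decrease in absolute residual error
# relative to the baseline model.
CHI2_CRIT = 3.84  # critical chi-square, df = 1, p = 0.05

def deviance_accuracy(baseline_deviance, model_deviance):
    """Decrease in deviance when the model predictor is added to the baseline."""
    return baseline_deviance - model_deviance

def residual_accuracy(baseline_residuals, model_residuals):
    """Per-item decrease in absolute residual error from the baseline model.
    Positive = the model predicts that item better than the baseline does."""
    return [abs(b) - abs(m) for b, m in zip(baseline_residuals, model_residuals)]

improvement = deviance_accuracy(250.0, 230.0)  # illustrative deviance values
print(improvement, improvement > CHI2_CRIT)
print(residual_accuracy([0.5, -0.4], [0.2, -0.6]))
```

Under these definitions, a model clears the dashed line in Figure 2 when adding its probabilities significantly reduces deviance, and sits above the "tie" line in Figure 3 on items it fits better than the baseline.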