Daniela Barbara Keller, Jörg Schultz.
Abstract
Words are built from smaller meaning-bearing parts, called morphemes. Just as one word can contain multiple morphemes, one morpheme can be present in different words. The number of distinct words a morpheme can be found in is its family size. Here we used Birth-Death-Innovation Models (BDIMs) to analyze the distribution of morpheme family sizes in English and German vocabulary over the last 200 years. Rather than just fitting a probability distribution, these mechanistic models allow for the direct interpretation of the identified parameters. Despite the complexity of language change, we indeed found that a specific variant of this purely stochastic model, the second order linear balanced BDIM, significantly fitted the observed distributions. In this model, birth and death rates are increased for smaller morpheme families. This finding indicates an influence of morpheme family sizes on vocabulary changes. This could be an effect of word formation, perception or both. On a more general level, we give an example of how mechanistic models can enable the identification of statistical trends in language change usually hidden by cultural influences.
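The definition of family size given in the abstract can be illustrated with a minimal sketch. It assumes words are already segmented into morphemes. The toy segmentation and the function names `family_sizes` and `size_distribution` are hypothetical and not taken from the paper.

```python
from collections import Counter


def family_sizes(segmented_words):
    """Map each morpheme to the number of distinct words containing it."""
    sizes = Counter()
    for word, morphemes in segmented_words.items():
        # set() ensures a morpheme repeated inside one word counts once
        for morpheme in set(morphemes):
            sizes[morpheme] += 1
    return dict(sizes)


def size_distribution(sizes):
    """Map each family size to the number of morphemes with that size."""
    return dict(Counter(sizes.values()))


# Hypothetical toy segmentation (an assumption, not the paper's data).
toy = {
    "unhappy": ["un", "happy"],
    "happiness": ["happy", "ness"],
    "unkind": ["un", "kind"],
    "kindness": ["kind", "ness"],
    "kind": ["kind"],
}

sizes = family_sizes(toy)
dist = size_distribution(sizes)
print(sizes)
print(dist)
```

A distribution such as `dist`, computed over a whole dictionary or corpus, is the kind of family size distribution that the models are fitted to.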
Year: 2014 PMID: 24705626 PMCID: PMC3976386 DOI: 10.1371/journal.pone.0093978
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. A general scheme of the BDI model for morpheme family distributions. Adopted from [34].
Figure 2. Word family distribution of current English (BNCbaby) in double logarithmic scale, with power law (green), simple BDIM (orange), solb BDIM (red) and folb BDIM (blue) fitted to the middle section [5,120].
AIC, BIC and P-values of chi square goodness of fit tests for all investigated models.
| Source | Language | Criterion | Power law | simple BDIM | solb BDIM | folb BDIM |
|---|---|---|---|---|---|---|
| Adelung | German 18th | AIC | 888.02 | 816.45 | | 805.91 |
| | | BIC | 896.28 | 824.71 | | 819.68 |
| | | Chi² | <10⁻⁷⁴ | <10⁻⁶ | | 0.4201 |
| WDG | German 20th | AIC | 1084.84 | 933.90 | | 882.73 |
| | | BIC | 1093.58 | 942.64 | | 897.30 |
| | | Chi² | <10⁻¹⁷² | <10⁻¹² | | 0.0383 |
| BLL | German 20th | AIC | 1248.20 | 1137.54 | | 1057.12 |
| | | BIC | 1257.35 | 1146.69 | | 1072.37 |
| | | Chi² | <10⁻⁷⁵ | <10⁻³⁷ | 0.2549 | |
| Johnson | English 18th | AIC | 727.73 | 665.38 | 654.17 | |
| | | BIC | 735.42 | 673.07 | | 666.18 |
| | | Chi² | <10⁻⁴⁸ | <10⁻¹⁵ | 0.0352 | |
| Webster | English beg. 20th | AIC | 762.28 | 650.06 | | 653.36 |
| | | BIC | 769.97 | 657.76 | | 666.18 |
| | | Chi² | <10⁻⁹¹ | <10⁻¹⁴ | 0.0156 | |
| BNCbaby | English end 20th | AIC | 897.26 | 779.25 | 744.72 | |
| | | BIC | 905.52 | 787.51 | | 758.24 |
| | | Chi² | <10⁻¹¹¹ | <10⁻⁶ | | 0.7068 |
For AIC and BIC, lower values mean better fit. For the chi-square test, non-significant p-values (>0.01) indicate a good fit of the model. Best fitting models are indicated in bold.
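As a rough illustration of how such criteria are compared, the sketch below uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of fitted parameters, n the number of data points and ln L the maximized log-likelihood. The model names, log-likelihoods, parameter counts and sample size are hypothetical values chosen for illustration and are not the paper's.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood


def best_model(scores):
    """Return the name of the model with the lowest criterion value."""
    return min(scores, key=scores.get)


# Hypothetical fits: (maximized log-likelihood, number of parameters).
fits = {"A": (-400.0, 3), "B": (-397.5, 5)}
n_points = 116  # hypothetical number of fitted data points

aic_scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
bic_scores = {name: bic(ll, k, n_points) for name, (ll, k) in fits.items()}

print(aic_scores, best_model(aic_scores))
print(bic_scores, best_model(bic_scores))
```

Because BIC penalizes each extra parameter by ln n rather than 2, the two criteria can prefer different models for the same data, as they do in this toy example.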