Abstract
Cross-linguistic studies focus on inverse correlations (trade-offs) between linguistic variables that reflect different cues to linguistic meanings. For example, if a language has no case marking, it is likely to rely on word order as a cue for identification of grammatical roles. Such inverse correlations are interpreted as manifestations of language users' tendency to use language efficiently. The present study argues that this interpretation is problematic. Linguistic variables, such as the presence of case, or flexibility of word order, are aggregate properties, which do not represent the use of linguistic cues in context directly. Still, such variables can be useful for circumscribing the potential role of communicative efficiency in language evolution, if we move from cross-linguistic trade-offs to multivariate causal networks. This idea is illustrated by a case study of linguistic variables related to four types of Subject and Object cues: case marking, rigid word order of Subject and Object, tight semantics and verb-medial order. The variables are obtained from online language corpora in thirty languages, annotated with Universal Dependencies. The causal model suggests that the relationships between the variables can be explained predominantly by sociolinguistic factors, leaving little space for a potential impact of efficient linguistic behavior.
Keywords: causal networks; efficiency; object; subject; trade-offs
Year: 2021 PMID: 34322056 PMCID: PMC8311235 DOI: 10.3389/fpsyg.2021.648200
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Languages in this study.
| Language | Genus | Family | UD 2.4 treebank identifier |
| Arabic | Semitic | Afro-Asiatic | arabic-padt-ud-2.4 |
| Bulgarian | Slavic | Indo-European | bulgarian-btb-ud-2.4 |
| Croatian | Slavic | Indo-European | croatian-set-ud-2.4 |
| Czech | Slavic | Indo-European | czech-pdt-ud-2.4 |
| Danish | Germanic | Indo-European | danish-ddt-ud-2.4 |
| Dutch | Germanic | Indo-European | dutch-alpino-ud-2.4 |
| English | Germanic | Indo-European | english-ewt-ud-2.4 |
| Estonian | Finnic | Uralic | estonian-edt-ud-2.4 |
| Finnish | Finnic | Uralic | finnish-tdt-ud-2.4 |
| French | Romance | Indo-European | french-gsd-ud-2.4 |
| German | Germanic | Indo-European | german-gsd-ud-2.4 |
| Greek (modern) | Greek | Indo-European | greek-gdt-ud-2.4 |
| Hindi | Indic | Indo-European | hindi-hdtb-ud-2.4 |
| Hungarian | Ugric | Uralic | hungarian-szeged-ud-2.4 |
| Indonesian | Malayo-Sumbawan | Austronesian | indonesian-gsd-ud-2.4 |
| Italian | Romance | Indo-European | italian-isdt-ud-2.4 |
| Japanese | Japanese | Japanese | japanese-gsd-ud-2.4 |
| Korean | Korean | Korean | korean-gsd-ud-2.4 |
| Latvian | Baltic | Indo-European | latvian-lvtb-ud-2.4 |
| Lithuanian | Baltic | Indo-European | lithuanian-hse-ud-2.4 |
| Persian | Iranian | Indo-European | persian-seraji-ud-2.4 |
| Portuguese | Romance | Indo-European | portuguese-bosque-ud-2.4 |
| Romanian | Romance | Indo-European | romanian-rrt-ud-2.4 |
| Russian | Slavic | Indo-European | russian-syntagrus-ud-2.4 |
| Slovenian | Slavic | Indo-European | slovenian-ssj-ud-2.4 |
| Spanish | Romance | Indo-European | spanish-gsd-ud-2.4 |
| Swedish | Germanic | Indo-European | swedish-talbanken-ud-2.4 |
| Tamil | Southern Dravidian | Dravidian | tamil-ttb-ud-2.4 |
| Turkish | Turkic | Altaic | turkish-imst-ud-2.4 |
| Vietnamese | Viet-Muong | Austro-Asiatic | vietnamese-vtb-ud-2.4 |
Frequencies of case forms in Spanish.
| Case form | Subject | Object |
| Zero marking | 126,736 | 569,252 |
| Preposition | 0 | 55,442 |
Frequencies of case forms in Hindi.
| Case form | Subject | Object |
| Absolutive (zero marking) | 46,241 | 363,647 |
| Ergative | 61,512 | 0 |
| Accusative | 0 | 92,510 |
Frequencies of case forms in Finnish (extrapolated).
| Case form | Subject | Object |
| Nominative (zero marking) | 132,631 | 94,077 |
| Genitive + Partitive | 9,562 | 386,268 |
FIGURE 1. Case marking (Mutual Information between Role and Case).
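As a rough illustration of the measure named in Figure 1, mutual information between Role and Case can be computed directly from a table of case-form counts. The sketch below is not the study's own code. It assumes that the two numeric columns of the frequency tables above are Subject and Object counts, that logarithms are base 2, and that no normalization is applied; the function name `mutual_information` is hypothetical.

```python
from math import log2


def mutual_information(table):
    """Mutual information (in bits) between the rows and columns of a count table.

    `table` is a list of rows; each row holds the counts for one case form,
    with one column per role (assumed here: Subject, Object).
    """
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    mi = 0.0
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            if n == 0:
                continue  # empty cells contribute nothing to the sum
            p_joint = n / total
            # p(x, y) / (p(x) * p(y)) simplifies to n * total / (row_sum * col_sum)
            mi += p_joint * log2(n * total / (row_sums[i] * col_sums[j]))
    return mi


# Counts copied from the frequency tables above (column meaning is an assumption).
spanish = [[126736, 569252], [0, 55442]]
hindi = [[46241, 363647], [61512, 0], [0, 92510]]
finnish = [[132631, 94077], [9562, 386268]]

for name, counts in [("Spanish", spanish), ("Hindi", hindi), ("Finnish", finnish)]:
    print(name, round(mutual_information(counts), 3))
```

Under these assumptions, a language whose case forms barely separate the two roles scores close to zero, while a language whose forms split cleanly by role scores closer to the entropy of the Role variable.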
A fragment of the Lexeme – Role matrix for English.
| Lexeme | Subject | Object |
| hunter | 40 | 22 |
| street | 34 | 466 |
| t-shirt | 3 | 118 |
FIGURE 2. Semantic tightness (Mutual Information between Role and Lexeme).
FIGURE 3. Rigidity of Subject – Object order (1 – entropy).
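The rigidity measure in Figure 3 (1 minus entropy) can be sketched as follows. This is a minimal illustration, assuming that the entropy is taken over the two possible orders (Subject before Object, Object before Subject) and measured in bits, so that the result falls between 0 and 1. The function name `order_rigidity` and the counts in the usage lines are hypothetical, not data from the study.

```python
from math import log2


def order_rigidity(n_subject_first, n_object_first):
    """Return 1 minus the binary entropy (in bits) of Subject-Object order.

    1.0 means only one order is ever used; 0.0 means both orders are equally frequent.
    """
    total = n_subject_first + n_object_first
    if total == 0:
        raise ValueError("at least one clause is needed")
    entropy = 0.0
    for n in (n_subject_first, n_object_first):
        if n > 0:  # an unused order contributes no entropy
            p = n / total
            entropy -= p * log2(p)
    return 1.0 - entropy


# Hypothetical counts, for illustration only.
print(order_rigidity(100, 0))   # fully rigid order
print(order_rigidity(50, 50))   # fully flexible order
print(order_rigidity(90, 10))   # mostly Subject-first
```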
FIGURE 4. Proportion of verb-medial clauses.
FIGURE 5. Spearman’s correlation coefficients between pairs of variables, averaged across 1,000 simulations. Top: simple pairwise coefficients. Bottom: partial coefficients.
95% confidence intervals around Spearman’s correlation coefficients based on 1,000 simulations.
| | Case marking | Tight semantics | Rigid word order |
| Tight semantics | 0.484, 0.499 | | |
| Rigid word order | −0.670, −0.664 | −0.162, −0.149 | |
| Verb between Subject and Object | −0.480, −0.469 | −0.446, −0.440 | 0.269, 0.281 |
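To make the distinction between simple and partial coefficients in Figure 5 concrete, the sketch below computes Spearman's correlation as Pearson's correlation on ranks, and a partial coefficient with the standard recursive formula that removes one control variable at a time. It is a hedged illustration, not the study's procedure: it does not reproduce the 1,000 simulations, all function names are hypothetical, and the three short vectors are invented values rather than measurements from the thirty languages.

```python
from math import sqrt


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, controls):
    """Spearman's rho between x and y, controlling for each variable in `controls`."""
    if not controls:
        return spearman(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_spearman(x, y, rest)
    r_xz = partial_spearman(x, z, rest)
    r_yz = partial_spearman(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented scores for six imaginary languages (illustration only).
case_marking = [0.1, 0.4, 0.35, 0.8, 0.6, 0.2]
rigid_order = [0.9, 0.5, 0.7, 0.3, 0.1, 0.8]
verb_medial = [0.7, 0.2, 0.6, 0.1, 0.5, 0.4]

print(spearman(case_marking, verb_medial))
print(partial_spearman(case_marking, verb_medial, [rigid_order]))
```

The partial coefficient shows how much of the association between two variables remains once their shared dependence on the control variables is removed, which is why the top and bottom panels of Figure 5 can differ.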
FIGURE 6. Causal network based on the FCI algorithm. Thickness of the edges reflects their frequencies in 1,000 samples.
Mean p-values of the edges in FCI.
| | Case marking | Tight semantics | Rigid order | Verb middle |
| Case marking | | 0.099 (0.002, 0.392) | 0.011 (0.001, 0.068) | 0.122 (0.027, 0.346) |
| Tight semantics | 0.099 (0.002, 0.392) | | 0.564 (0.109, 1) | 0.128 (0.021, 0.895) |
| Rigid order | 0.011 (0.001, 0.068) | 0.564 (0.109, 1) | | 0.319 (0.058, 0.750) |
| Verb middle | 0.122 (0.027, 0.346) | 0.128 (0.021, 0.895) | 0.319 (0.058, 0.750) | |