Carlos Eduardo Guerra Amorim, Rafael Bisso-Machado, Virginia Ramallo, Maria Cátira Bortolini, Sandro Luis Bonatto, Francisco Mauro Salzano, Tábita Hünemeier.
Abstract
The relationship between the evolution of genes and languages has been studied for over three decades. These studies rely on the assumption that languages, like many other cultural traits, evolve in a gene-like manner, accumulating heritable diversity through time and being subjected to evolutionary mechanisms of change. In the present work we used genetic data to evaluate South American linguistic classifications. We compared discordant models of language classification to current Native American genome-wide variation using realistic demographic models analyzed under an Approximate Bayesian Computation (ABC) framework. Data on 381 STRs spread along the autosomes were gathered from the literature for populations representing the five main South Amerindian linguistic groups: Andean, Arawakan, Chibchan-Paezan, Macro-Jê, and Tupí. The results indicated a higher posterior probability for the classification proposed by J.H. Greenberg in 1987, although L. Campbell's 1997 classification cannot be ruled out. Based on Greenberg's classification, it was possible to date the time of Tupí-Arawakan divergence (2.8 kya) and the time of emergence of the structure between present-day major language groups in South America (3.1 kya).
Year: 2013 PMID: 23696865 PMCID: PMC3656118 DOI: 10.1371/journal.pone.0064099
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Alternative demographic models tested against the genetic variation in 381 autosomal STRs.
Parameters are explained in Table 1. Current average deme size (NP) and gene flow (m) between populations are not shown.
Prior distributions of selected model parameters.
| Parameter | Distribution | Range | References |
| T0 – Time for the onset of expansion | Uniform | 10,000–19,000 | |
| T1 – Time for the first emergence of structure | Uniform | 800–6,400 | |
| T2 – Time for the second emergence of structure | Uniform | 800–6,400 | |
| N0 – Ancestral effective population size | Uniform | 2–1,000 | |
| N1 – Effective population (continental) size | Uniform | 1,000–100,000 | |
| NP – Current effective deme size | Gamma (10, 10/NP) | 50–1,000 | |
| m – Symmetric migration rate | Uniform | 0.00001–0.001 | |
Time is given in years before present and effective population size in number of diploid individuals (2n). T1 and T2 prior distributions may present deviations from uniformity, since T1>T2.
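The note on T1 and T2 can be illustrated with a minimal sketch of drawing from these priors. The function name `sample_priors` is hypothetical, and enforcing T1 > T2 by ordering two uniform draws is an assumption (it is equivalent to rejecting pairs that violate the constraint); the source does not say how the constraint was implemented. NP is left out because the table does not fully specify its hyperparameters.

```python
import random

def sample_priors(rng):
    """Draw one parameter set from the uniform priors in the table (NP omitted)."""
    # Two draws from Uniform(800, 6400), ordered so that T1 > T2.
    # Assumption: the constraint is enforced by ordering the pair.
    a = rng.uniform(800, 6400)
    b = rng.uniform(800, 6400)
    return {
        "T0": rng.uniform(10_000, 19_000),  # onset of expansion (years BP)
        "T1": max(a, b),                    # first emergence of structure
        "T2": min(a, b),                    # second emergence of structure
        "N0": rng.uniform(2, 1_000),        # ancestral effective size
        "N1": rng.uniform(1_000, 100_000),  # continental effective size
        "m": rng.uniform(0.00001, 0.001),   # symmetric migration rate
    }

rng = random.Random(1)
draws = [sample_priors(rng) for _ in range(10_000)]
mean_t1 = sum(d["T1"] for d in draws) / len(draws)
mean_t2 = sum(d["T2"] for d in draws) / len(draws)
print(f"mean T1 = {mean_t1:.0f}, mean T2 = {mean_t2:.0f}")
```

Under this ordering scheme the marginal means shift away from the 3,600-year midpoint of the range: analytically, the larger of two uniform draws averages about 4,533 years and the smaller about 2,667 years, which is the kind of deviation from uniformity the note describes.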
Posterior probability of three linguistic classifications for South American languages given the genetic diversity of 381 autosomal STRs.
| Linguistic classification | Posterior probability (%), Method I | Posterior probability (%), Method II |
| Campbell | 40.3 | 43.0 |
| Greenberg | 59.1 | 51.0 |
| Loukotka | 0.6 | 6.0 |
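One way to read these percentages is as posterior odds between pairs of models. The sketch below computes the Greenberg-to-Campbell ratio from the tabulated values; it assumes the three classifications had equal prior probability (not stated in this record), in which case the posterior odds equal the Bayes factor. The helper name `posterior_odds` is hypothetical.

```python
# Posterior probabilities (%) copied from the table above.
posterior = {
    "Method I": {"Campbell": 40.3, "Greenberg": 59.1, "Loukotka": 0.6},
    "Method II": {"Campbell": 43.0, "Greenberg": 51.0, "Loukotka": 6.0},
}

def posterior_odds(probs, model_a, model_b):
    """Ratio of posterior probabilities of model_a to model_b.

    Assumption: with equal prior model probabilities this ratio
    is also the Bayes factor of model_a over model_b.
    """
    return probs[model_a] / probs[model_b]

for method, probs in posterior.items():
    odds = posterior_odds(probs, "Greenberg", "Campbell")
    print(f"{method}: Greenberg vs. Campbell odds = {odds:.2f}")
```

The ratios come out at roughly 1.47 (Method I) and 1.19 (Method II). Odds this close to 1 are conventionally read as weak support, which is consistent with the statement that Campbell's classification cannot be ruled out.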
Figure 2. Prior (black), posterior (red), and retained-simulation (blue) distributions of the time (in generations) and size (2n) parameters of the demographic model based on Greenberg's [22] language classification.
Posterior characteristics of the parameters of the model designed based on Greenberg's [22] classification given the genetic diversity of 381 autosomal STRs.
| Parameter | Posterior mode | Posterior median | Posterior HPDI | R | RMSE (mode) | RMSE (median) | P-value |
| T0 | 10,905 | 14,040 | 10,136–18,683 | 0.00 | 3,625 | 2,675 | 0.00 |
| T1 | 2,779 | 3,094 | 1,480–5,294 | 0.03 | 1,300 | 1,000 | 0.05 |
| T2 | 2,666 | 2,812 | 800–4,382 | 0.40 | 925 | 850 | 0.71 |
| N0 | 52 | 419 | 2–985 | 0.00 | 423 | 292 | 0.47 |
| N1 | 19,905 | 45,852 | 2,492–96,020 | 0.00 | 40,407 | 28,474 | 0.92 |
| NP | 967 | 912 | 709–1,000 | 0.74 | 117 | 106 | 0.57 |
HPDI: highest posterior density interval, the continuous interval of parameter values with the highest posterior density.
R: coefficient of determination (R2) obtained when regressing the parameter against the summary statistics.
RMSE: root mean squared error.
P-value: P-value from the Kolmogorov-Smirnov test for uniformity of posterior quantiles.
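For a sample-based posterior, the highest posterior density interval defined above can be approximated as the shortest window that contains a chosen share of the sorted samples. The sketch below illustrates that idea only; the function name `hpdi` and the 95% mass are assumptions, since this record does not state the credibility level or the software routine used.

```python
import math
import random

def hpdi(samples, mass=0.95):
    """Shortest contiguous interval containing `mass` of the samples.

    Assumption: the posterior is unimodal, so a single interval suffices.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples; keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

# Illustrative right-skewed draws (synthetic, not the study's posterior).
rng = random.Random(42)
toy_posterior = [rng.gammavariate(2.0, 1.0) for _ in range(5_000)]
low, high = hpdi(toy_posterior)
print(f"95% HPDI of toy posterior: {low:.2f} to {high:.2f}")
```

For a skewed distribution such as the toy example, the shortest interval sits closer to the mode than an equal-tailed interval would, which is why HPDI bounds can coincide with a prior limit (as with the lower bound of 800 for T2).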
Figure 3. Quantile distributions (x-axis) of the known parameter values as inferred from the posterior distributions for 1,000 pseudo-observed data sets generated under Greenberg's [22] model.
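The uniformity check behind Figure 3 and the P-value column can be sketched as a one-sample Kolmogorov-Smirnov test of posterior quantiles against Uniform(0, 1). This is a generic illustration, not the authors' code: the function name `ks_uniform` is hypothetical, the P-value uses the standard asymptotic series with a common small-sample correction, and the quantiles below are synthetic.

```python
import math
import random

def ks_uniform(quantiles):
    """KS statistic and asymptotic P-value against Uniform(0, 1)."""
    u = sorted(quantiles)
    n = len(u)
    # Largest gaps between the empirical CDF and the uniform CDF F(x) = x.
    d_plus = max((i + 1) / n - x for i, x in enumerate(u))
    d_minus = max(x - i / n for i, x in enumerate(u))
    d = max(d_plus, d_minus)
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return d, 1.0  # the series is numerically 1 in this range
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))

rng = random.Random(7)
calibrated = [rng.random() for _ in range(1_000)]   # well-calibrated case
biased = [rng.random() ** 3 for _ in range(1_000)]  # quantiles piled near 0
print("calibrated:", ks_uniform(calibrated))
print("biased:    ", ks_uniform(biased))
```

When the posterior is well calibrated, the quantile at which the known parameter value falls is uniformly distributed across pseudo-observed data sets, so a small P-value (as reported for T0) signals a biased or poorly calibrated estimate.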