Lutz Roewer, Michael Nothnagel, Leonor Gusmão, Veronica Gomes, Miguel González, Daniel Corach, Andrea Sala, Evguenia Alechine, Teresinha Palha, Ney Santos, Andrea Ribeiro-Dos-Santos, Maria Geppert, Sascha Willuweit, Marion Nagy, Sarah Zweynert, Miriam Baeta, Carolina Núñez, Begoña Martínez-Jarreta, Fabricio González-Andrade, Elizeu Fagundes de Carvalho, Dayse Aparecida da Silva, Juan José Builes, Daniel Turbón, Ana Maria Lopez Parra, Eduardo Arroyo-Pardo, Ulises Toscanini, Lisbeth Borjas, Claudia Barletta, Elizabeth Ewart, Sidney Santos, Michael Krawczak.
Abstract
Numerous studies of human populations in Europe and Asia have revealed a concordance between their extant genetic structure and the prevailing regional pattern of geography and language. For native South Americans, however, such evidence has been lacking so far. Therefore, we examined the relationship between Y-chromosomal genotype on the one hand, and male geographic origin and linguistic affiliation on the other, in the largest study of South American natives to date in terms of sampled individuals and populations. A total of 1,011 individuals, representing 50 tribal populations from 81 settlements, were genotyped for up to 17 short tandem repeat (STR) markers and 16 single nucleotide polymorphisms (Y-SNPs), the latter resolving phylogenetic lineages Q and C. Virtually no structure became apparent for the extant Y-chromosomal genetic variation of South American males that could sensibly be related to their inter-tribal geographic and linguistic relationships. This continent-wide decoupling is consistent with a rapid peopling of the continent followed by long periods of isolation in small groups. Furthermore, for the first time, we identified a distinct geographical cluster of Y-SNP lineages C-M217 (C3*) in South America. Such haplotypes are virtually absent from North and Central America, but occur at high frequency in Asia. Together with the locally confined Y-STR autocorrelation observed in our study as a whole, the available data therefore suggest a late introduction of C3* into South America no more than 6,000 years ago, perhaps via coastal or trans-Pacific routes. Extensive simulations revealed that the observed lack of haplogroup C3* among extant North and Central American natives is only compatible with low levels of migration between the ancestor populations of C3* carriers and non-carriers. 
In summary, our data highlight the fact that a pronounced correlation between genetic and geographic/cultural structure can only be expected under very specific conditions, most of which are likely not to have been met by the ancestors of native South Americans.
Year: 2013 PMID: 23593040 PMCID: PMC3623769 DOI: 10.1371/journal.pgen.1003460
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Origin of male native South American samples.
For each sampling site, its geographic location as well as the size (proportional to the circle area) and Y-SNP haplogroup composition of the respective sample are shown. Blue lines: major aquatic systems; dashed gray lines: current national boundaries.
Figure 2. Multidimensional scaling (MDS) analysis of Y-STR genotypes.
Depicted are the first two RST-based MDS components, C1 and C2, as obtained for the small marker set. The center of each graph has been magnified for better resolution. A: grouping of sampling sites in AMOVA according to fine geographic clustering (A); B: AMOVA grouping according to broad geographic clustering (B); C: AMOVA grouping according to haplogroup affiliation; D: AMOVA grouping according to class of spoken language (excluding samples of individuals with unassigned language class).
Figure 3. Spatial autocorrelation analysis of Y-STR genotypes.
The spatial autocorrelation analysis was based upon the small marker set. Bold line: all samples; dashed line: Q-M3 (Q1a3a) haplogroup carriers only; filled circles: significant autocorrelation (p<0.05); empty circles: non-significant autocorrelation.
Correlation between Y-SNP haplogroup and language class.
| | Y-SNP haplogroup | | | | |
| Language class | C3* | Q1a3 | Q1a3a | Q1a3a - del | Q1a3a1 |
| Andean | 11 | 5 | 150 | 0 | 0 |
| Chibchan-Paezan | 0 | 22 | 57 | 0 | 0 |
| Equatorial-Tucanoan | 0 | 12 | 387 | 0 | 0 |
| Ge-Pano-Carib | 0 | 19 | 296 | 6 | 6 |
| Wao-Tiriro isolate | 3 | 0 | 37 | 0 | 0 |
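As a reading aid, the counts in this table can be tabulated programmatically. The following is a minimal Python sketch and not part of the original analysis; the names `HAPLOGROUPS`, `COUNTS`, `row_totals` and `haplogroup_share` are illustrative assumptions, and the numbers are copied from the table as printed.

```python
# Counts copied from the table above.
# Rows: language class; columns: Y-SNP haplogroup (order given in HAPLOGROUPS).
HAPLOGROUPS = ["C3*", "Q1a3", "Q1a3a", "Q1a3a - del", "Q1a3a1"]
COUNTS = {
    "Andean": [11, 5, 150, 0, 0],
    "Chibchan-Paezan": [0, 22, 57, 0, 0],
    "Equatorial-Tucanoan": [0, 12, 387, 0, 0],
    "Ge-Pano-Carib": [0, 19, 296, 6, 6],
    "Wao-Tiriro isolate": [3, 0, 37, 0, 0],
}


def row_totals(counts):
    """Number of typed individuals per language class."""
    return {cls: sum(row) for cls, row in counts.items()}


def haplogroup_share(counts, haplogroup):
    """Relative frequency of one haplogroup within each language class."""
    idx = HAPLOGROUPS.index(haplogroup)
    return {cls: row[idx] / sum(row) for cls, row in counts.items()}


totals = row_totals(COUNTS)
c3_share = haplogroup_share(COUNTS, "C3*")

for cls in COUNTS:
    print(f"{cls}: n={totals[cls]}, C3* share={c3_share[cls]:.3f}")
print("all language classes:", sum(totals.values()))
```

In this tabulation, C3* carriers appear only in the Andean and Wao-Tiriro rows, and the row totals sum to the 1,011 individuals reported in the abstract.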
Figure 4. Prevalence of Y-SNP haplogroup C-M217 (C3*) around the Pacific Ocean.
Light blue: previous studies; dark blue: present study; yellow: relative frequency of C-M217 (C3*) carriers.
Figure 5. Median-joining network of 167 different Asian and American Y-STR haplotypes carrying Y-SNP haplogroup C3* (from this and previously published studies).
The median-joining network is based upon markers DYS19, DYS389I, DYS389II-DYS389I, DYS390, DYS391, DYS392, DYS393 and DYS439 (see Materials and Methods for details). ALA: Alaskan; KOR: Korean; CHI: Chinese, including Daur, Uygur, Manchu; MON: Mongolian, including Kalmyk, Tuva, Buryat; ANA: Anatolian; INDO: Vietnamese, Thai, Malaysian, Indonesian, Philippines; JAP: Japanese; TIB: Tibetan, Nepalese; ALT: Altaian, including Kazakh, Uzbek; SIB: Teleut, Khamnigan, Evenk, Koryak; ECU: Ecuadorian, including Waorani, Lowland Kichwa; COL: Colombia, including Wayuu; RUS: Russian.
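The marker written as DYS389II-DYS389I in this list is a difference of repeat counts: the DYS389II amplicon contains the DYS389I repeat stretch, so the DYS389I count is subtracted to obtain an independent value. A minimal Python sketch of that preprocessing step follows; the function name `network_markers` and the allele values in `example` are hypothetical and are not taken from the study data.

```python
# Markers that enter the network unchanged (as listed in the caption above).
DIRECT_MARKERS = ["DYS19", "DYS389I", "DYS390", "DYS391", "DYS392", "DYS393", "DYS439"]


def network_markers(haplotype):
    """Return the eight marker values used for the network.

    `haplotype` maps marker names to repeat counts and must contain
    both DYS389I and DYS389II.
    """
    reduced = {m: haplotype[m] for m in DIRECT_MARKERS}
    # DYS389II includes the DYS389I repeats, so use the difference instead.
    reduced["DYS389II-DYS389I"] = haplotype["DYS389II"] - haplotype["DYS389I"]
    return reduced


# Hypothetical haplotype, for illustration only.
example = {
    "DYS19": 15, "DYS389I": 13, "DYS389II": 29, "DYS390": 24,
    "DYS391": 10, "DYS392": 11, "DYS393": 13, "DYS439": 12,
}
reduced_example = network_markers(example)
print(reduced_example)
```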
Estimation of the TMRCA (in generations) of C3* Y chromosomes.
| | BATWING model 0 | | BATWING model 2 | |
| Prior Nc (Na) | 380 | 2000 | 380 | 2000 |
| Haplogroup C3* chromosomes from present study | | | | |
| TMRCA | 192 (127–292) | 206 (135–317) | 168 (113–252) | 176 (117–265) |
| Posterior mean of Nc (Na) | 98 (64–118) | 108 (70–168) | 72 (49–108) | 77 (51–117) |
| Inclusion of a previously published C3* chromosome from Alaska | | | | |
| TMRCA | 231 (158–341) | 249 (168–374) | 202 (140–294) | 214 (147–316) |
| Posterior mean of Nc (Na) | 140 (95–163) | 156 (104–234) | 99 (69–145) | 108 (73–160) |
Prior Nc (Na): mean of the prior distribution adopted for randomly sampling the two population size parameters in BATWING (Nc for model 0, Na for model 2). The results were obtained from 2 million BATWING runs per combination of haplogroup, distribution parameter and population model. Given are the median and, in parentheses, the inter-quartile range of the TMRCA (in generations) and of the posterior mean of Nc and Na, each time taken over the last 1 million runs. The first million runs were used for burn-in.
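Because the TMRCA estimates are given in generations, comparing them with the abstract's bound of 6,000 years requires a generation time. The sketch below shows the conversion in Python; the generation time of 25 years is an assumption made here for illustration (this section does not state the value used), and the input of 192 generations is the first-listed median in the table, taken as an example.

```python
def generations_to_years(generations, years_per_generation):
    """Convert a time measured in generations into calendar years."""
    return generations * years_per_generation


# Assumed generation time; not stated in this section.
ASSUMED_YEARS_PER_GENERATION = 25

# First-listed median and inter-quartile range from the table (in generations).
median_gen, iqr_low_gen, iqr_high_gen = 192, 127, 292

median_years = generations_to_years(median_gen, ASSUMED_YEARS_PER_GENERATION)
iqr_years = (
    generations_to_years(iqr_low_gen, ASSUMED_YEARS_PER_GENERATION),
    generations_to_years(iqr_high_gen, ASSUMED_YEARS_PER_GENERATION),
)
print(median_years, iqr_years)
```

A different assumed generation time scales the result proportionally, so the conversion is best read as an order-of-magnitude check rather than a precise date.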