Chiara Barbieri, Rodrigo Barquera, Leonardo Arias, José R Sandoval, Oscar Acosta, Camilo Zurita, Abraham Aguilar-Campos, Ana M Tito-Álvarez, Ricardo Serrano-Osuna, Russell D Gray, Fabrizio Mafessoni, Paul Heggarty, Kentaro K Shimizu, Ricardo Fujita, Mark Stoneking, Irina Pugach, Lars Fehren-Schmitz.
Abstract
Studies of Native South American genetic diversity have helped to shed light on the peopling and differentiation of the continent, but available data are sparse for the major ecogeographic domains. These include the Pacific Coast, a potential early migration route; the Andes, home to the most expansive complex societies and to one of the most widely spoken indigenous language families of the continent (Quechua); and Amazonia, with its understudied population structure and rich cultural diversity. Here, we explore the genetic structure of 176 individuals from these three domains, genotyped with the Affymetrix Human Origins array. We infer multiple sources of ancestry within the Native American ancestry component: one with clear predominance on the Coast and in the Andes, and at least two distinct substrates in neighboring Amazonia, including a previously undetected ancestry characteristic of northern Ecuador and Colombia. Amazonian populations are also involved in recent gene-flow with each other and across ecogeographic domains, which does not accord with the traditional view of small, isolated groups. Long-distance genetic connections between speakers of the same language family suggest that indigenous languages here were spread not by cultural contact alone. Finally, Native American populations admixed with post-Columbian European and African sources at different times, with few cases of prolonged isolation. With our results we emphasize the importance of including understudied regions of the continent in high-resolution genetic studies, and we illustrate the potential of SNP chip arrays for informative regional-scale analysis.
Keywords: Native American population genetics; South American prehistory; admixture; human migration; runs of homozygosity
Year: 2019; PMID: 31350885; PMCID: PMC6878948; DOI: 10.1093/molbev/msz174
Source DB: PubMed; Journal: Mol Biol Evol; ISSN: 0737-4038; Impact factor: 16.240
Fig. 1. Map showing the approximate sampling locations of the newly reported population samples from South America, together with the ADMIXTURE results for K = 8. On top of the ADMIXTURE plot, newly reported population samples (in boldface) are shown together with other Native American samples from the literature, similarly typed with the Human Origins Affymetrix array. Yoruba and Spanish were also included in the ADMIXTURE runs to visualize African and European admixture.
Fig. 2. Principal component analysis of the newly reported samples together with representative populations from North and South America. (A) First and second dimension. (B) First and third dimension. (C) First and fourth dimension. PCA was run with a subset of 2,545 SNPs previously defined as ascertained with Karitiana (see Materials and Methods). Color legend corresponds to geographic grouping. Three Cocama-speaking individuals from the “LoretoMix” population are marked with a red asterisk in the first PCA panel and discussed in the section on IBD analysis.
Fig. 3. Distribution of ROH classes. ROH analyses are run on a pruned data set of 232,755 SNPs to avoid tracts affected by linkage disequilibrium. Classes of ROH are identified following Pemberton et al. (2012). (A) Proportion of small and large ROH classes for each individual. (B) ROH length-class profiles per group, showing the variance in total ROH length per individual, binned into six length classes.
Fig. 4. Results of the IBD sharing analysis. (A) Symmetrical matrix of pairwise IBD block sharing, showing the total length and the number of occurrences adjusted by population size. Populations are ordered by ecoregion and color-coded as in figure 2. (B) Map visualizing the connections between populations that share blocks with each other: thin yellow lines indicate the lowest levels of exchange, thick red lines the highest (adjusted for population size). Only blocks larger than 5 cM are considered.
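The caption does not say how sharing is adjusted for population size. The sketch below shows one plausible way to do it, under loudly stated assumptions: blocks of 5 cM or shorter are discarded, and both the total length and the block count are divided by the number of possible individual pairs. The function name, the input layout and the normalization itself are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

MIN_BLOCK_CM = 5.0  # threshold stated in the caption: only blocks larger than 5 cM


def summarize_ibd(blocks, pop_sizes, min_cm=MIN_BLOCK_CM):
    """Summarize IBD sharing per population pair.

    blocks    : iterable of (pop_a, pop_b, length_cm), one entry per shared block
    pop_sizes : dict mapping population name -> number of sampled individuals

    Returns {(pop_a, pop_b): (total_cm_per_pair, n_blocks_per_pair)}.
    ASSUMPTION: "adjusted by population size" is modeled here as division by
    the number of possible individual pairs; the paper may use another scheme.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for pop_a, pop_b, length_cm in blocks:
        if length_cm <= min_cm:
            continue  # keep only blocks strictly larger than the threshold
        key = tuple(sorted((pop_a, pop_b)))  # symmetric matrix: (A, B) == (B, A)
        totals[key][0] += length_cm
        totals[key][1] += 1

    summary = {}
    for (a, b), (total_cm, count) in totals.items():
        if a == b:
            n_pairs = pop_sizes[a] * (pop_sizes[a] - 1) / 2  # pairs within one population
        else:
            n_pairs = pop_sizes[a] * pop_sizes[b]  # pairs across two populations
        if n_pairs == 0:
            continue  # no valid pair of individuals to normalize by
        summary[(a, b)] = (total_cm / n_pairs, count / n_pairs)
    return summary
```

With two hypothetical populations of 2 and 4 individuals sharing blocks of 6, 4 and 10 cM, the 4 cM block is dropped and the remaining 16 cM in 2 blocks are spread over 8 possible pairs.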
Fig. 5. Admixture dates between European and African sources. Estimates of admixture are calculated with the MALDER and WAVELETS methods. Dates are expressed in generations ago and converted to calendar years using a generation time of 29 years.
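The conversion from generations to calendar years is a simple linear rescaling, sketched below. The generation time of 29 years comes from the caption; the reference year from which dates are counted back is not given here, so it is left as an explicit parameter, and the function names are hypothetical.

```python
GENERATION_TIME_YEARS = 29  # generation time stated in the caption


def generations_to_years_ago(generations, generation_time=GENERATION_TIME_YEARS):
    """Convert an admixture date in generations ago to years ago."""
    return generations * generation_time


def generations_to_calendar_year(generations, reference_year,
                                 generation_time=GENERATION_TIME_YEARS):
    """Convert generations ago to a calendar year.

    ASSUMPTION: reference_year is the year the estimate is counted back from
    (for example the sampling year); the caption does not specify it.
    """
    return reference_year - generations_to_years_ago(generations, generation_time)
```

For example, an estimate of 10 generations ago corresponds to 290 years ago. Counted back from a hypothetical reference year of 2000, that is the calendar year 1710.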