María C Ávila-Arcos, Kimberly F McManus, Karla Sandoval, Juan Esteban Rodríguez-Rodríguez, Viridiana Villa-Islas, Alicia R Martin, Pierre Luisi, Rosenda I Peñaloza-Espinosa, Celeste Eng, Scott Huntsman, Esteban G Burchard, Christopher R Gignoux, Carlos D Bustamante, Andrés Moreno-Estrada.
Abstract
Native American genetic variation remains underrepresented in most catalogs of human genome sequencing data. Previous genotyping efforts have revealed that Mexico's Indigenous population is highly differentiated and substructured, thus potentially harboring higher proportions of private genetic variants of functional and biomedical relevance. Here we have targeted the coding fraction of the genome and characterized its full site frequency spectrum by sequencing 76 exomes from five Indigenous populations across Mexico. Using diffusion approximations, we modeled the demographic history of Indigenous populations from Mexico, with northern and southern ethnic groups splitting 7.2 KYA and subsequently diverging locally 6.5 and 5.7 KYA, respectively. Scans for positive selection revealed the BCL2L13 and KBTBD8 genes as potential candidates for adaptive evolution in Rarámuris and Triquis, respectively. BCL2L13 is highly expressed in skeletal muscle and could be related to physical endurance, a well-known phenotype of the northern Mexico Rarámuri. The KBTBD8 gene has been associated with idiopathic short stature, and we found it to be highly differentiated in Triqui, a southern Indigenous group from Oaxaca whose height is extremely low compared to other Native populations.
Keywords: Native Americans; adaptive evolution; demographic inference; exome sequencing
Year: 2020 PMID: 31848607 PMCID: PMC7086176 DOI: 10.1093/molbev/msz282
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Figure: Sampling locations and inferred demographic model for NM populations. Inferred split times are shown on the demographic model and effective population sizes (Ne) are shown on the map. Each branch represents one of the populations used in the demographic inference; colors correspond to those shown in the map displaying the sampling locations of the participant NM. The Nahua were not included in the demographic inference (see Discussion).
Figure: Genes likely under adaptive evolution in NM. (a) Distribution of gene-based PBS values. Genes with extreme PBS values (99.9th percentile) are highlighted in blue. The x-axis shows the PBS value and the y-axis represents the frequency at which that value was observed in all NM. (b) Distribution of gene-based PBS values as a function of the number of SNPs per gene. Colors represent different SNP/gene bins, from one to ten SNPs/gene. The same genes as in (a) are shown, illustrating that their PBS values are at the top of their respective bin category.
Figure: PBS as a function of the number of SNPs in each gene for (a) HUI, (b) MYA, (c) TAR, and (d) TRQ. Colors and symbols represent the population used as the second population for the computation of the PBS. Genes under likely adaptive evolution are shown in bold with their corresponding P-value (the few genes with low PBS scores that seem to appear as bold are not candidate genes, but rather multiple data points overlapping on the same position of the plot). A tree in each panel illustrates the topology used for the PBS calculation. CHB stands for Han Chinese; this population was used as the third, distantly related, population in all comparisons.
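The three-population topology described in the caption above can be illustrated with a minimal sketch of a PBS calculation. This record does not spell out the formula, so the sketch assumes the commonly used definition of the population branch statistic: each pairwise FST is converted to a branch length T = -ln(1 - FST), and the focal branch is (T_focal,second + T_focal,third - T_second,third) / 2. The function names, the clamping of negative FST estimates to zero, and the FST values in the usage lines are all hypothetical and are not taken from the study.

```python
import math


def branch_length(fst: float) -> float:
    """Convert a pairwise FST into a branch length, T = -ln(1 - FST).

    Assumption: negative FST estimates (which some estimators can produce)
    are clamped to zero before the transformation.
    """
    fst = max(fst, 0.0)
    if fst >= 1.0:
        raise ValueError("FST must be below 1 for the log transform")
    return -math.log(1.0 - fst)


def pbs(fst_focal_second: float,
        fst_focal_third: float,
        fst_second_third: float) -> float:
    """Population branch statistic for the focal population.

    Arguments are pairwise FST values for (focal, second),
    (focal, third) and (second, third), where the third population
    plays the role of the distantly related one (CHB in the caption).
    """
    t_fs = branch_length(fst_focal_second)
    t_ft = branch_length(fst_focal_third)
    t_st = branch_length(fst_second_third)
    return (t_fs + t_ft - t_st) / 2.0


# Hypothetical FST values for one gene, for illustration only.
example = pbs(fst_focal_second=0.30, fst_focal_third=0.35, fst_second_third=0.10)
print(f"PBS for the focal population: {example:.4f}")
```

A large value indicates that allele frequencies changed mostly on the branch leading to the focal population, which is why genes in the extreme tail of the PBS distribution are treated as candidates.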