Linda Ongaro1, Mayukh Mondal1, Rodrigo Flores1, Davide Marnetto1, Ludovica Molinaro1, Marta E Alarcón-Riquelme2, Andrés Moreno-Estrada3, Nedio Mabunda4, Mario Ventura5, Kristiina Tambets1, Garrett Hellenthal6, Cristian Capelli7,8, Toomas Kivisild9, Mait Metspalu1, Luca Pagani1,10, Francesco Montinaro1,5.
Abstract
American populations are among the most interesting examples of recently admixed groups, where ancestral components from three major continental human groups (Africans, Eurasians and Native Americans) have admixed within the last 15 generations. Recently, several genetic surveys focusing on thousands of individuals have shed light on the geography, chronology and relevance of these events. However, even though gene flow could drive adaptive evolution, it is unclear whether and how natural selection acted on the resulting genetic variation in the Americas. In this study, we analysed the patterns of local ancestry of genomic fragments in genome-wide data for ~6000 admixed individuals from 10 American countries. In doing so, we identified regions characterized by a divergent ancestry profile (DAP), in which a given ancestry is significantly over- or under-represented. Our results highlighted a series of genomic regions with DAPs associated with immune system response and relevant medical traits, with the longest DAP region encompassing the human leukocyte antigen locus. Furthermore, we found that DAP regions are enriched in genes linked to cancer-related traits and autoimmune diseases. Then, analysing the biological impact of these regions, we showed that natural selection could have acted preferentially on variants located in coding and non-coding transcripts and characterized by a high deleteriousness score. Taken together, our analyses suggest that shared patterns of post-admixture adaptation occurred at a continental scale in the Americas, more often affecting functional and impactful genomic variants.
Year: 2021 PMID: 34196708 PMCID: PMC8561420 DOI: 10.1093/hmg/ddab177
Source DB: PubMed Journal: Hum Mol Genet ISSN: 0964-6906 Impact factor: 6.150
Admixed American populations under study. N refers to the number of individuals included in the dataset
| Population | N | Country | References |
|---|---|---|---|
| ACB | 68 | Barbados | 1000 Genomes Project |
| AfroAme | 2004 | USA | Illumina iControl Database |
| Argentina | 133 | Argentina | Lopez Herráez |
| ASW | 55 | Southwest USA | 1000 Genomes Project |
| Caribbean | 1112 | Puerto Rico and Dominican Republic | Ghani |
| Chile | 25 | Chile | Lopez Herráez |
| CLM | 72 | Colombia | 1000 Genomes Project |
| Colombian | 26 | Colombia | Bryc |
| Dominican | 27 | Dominican Republic | Bryc |
| Ecuadorian | 19 | Ecuador | Bryc |
| EuroAme | 1562 | USA | Illumina iControl Database |
| Maya | 25 | Mexico | Moreno-Estrada |
| Mayas | 21 | Mexico | Li |
| Mexican | 364 | Mexico | Moreno-Estrada |
| MXL | 63 | Mexico | 1000 Genomes Project |
| PEL | 68 | Peru | 1000 Genomes Project |
| Peru | 85 | Peru | Lopez Herráez |
| Puerto | 26 | Puerto Rico | Bryc |
| PUR | 73 | Puerto Rico | 1000 Genomes Project |
Figure 1
An overview of divergent ancestry profile (DAP) regions inferred from local ancestry profiles for all ancestries. (A) Population distribution of DAPs. The x axis shows DAPs in single populations, while the y axis shows DAP sharing among groups. (B) Genomic location of shared DAPs. Different colours refer to the ancestry and direction of the divergence, as indicated in the legend.
Details about inferred genomic regions with divergent ancestry profiles. ‘Wind’ refers to the number of 100 kb DAP windows inside the genomic region; ‘SNPs’ shows the number of SNPs contained in the genomic region; ‘Ancestry’ refers to the specific ancestry for which we found an underrepresented (−) or overrepresented (+) DAP window or region; ‘Type’ describes the DAP regions as Inter-Samples (IS-DAP) or Inter-Populations (IP-DAP); ‘Populations (Z)’ reports Z scores for the significant populations; ‘N genes’ shows the number of genes related to SNPs in the genomic region; ‘Genes’ refers to the gene/clone identifiers provided in the Ensembl annotation.
| Chr | Start | End | Wind | SNPs | Length (kb) | Dataset | Ancestry | Type | Populations (Z) | N genes | Genes |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 24 512 590 | 25 679 209 | 9 | 91 | 1167 | 1090Ind | -America | IS-DAP | PEL (Z = −3.1) Peru (Z Min = −3.1, Max = −3.5) | 15 | MIR573;DHX15;RP11-496D24.2;AC006390.4;SOD3;CCDC149;SEPSECS;PI4K2B;ZCCHC4;AC108218.1;RP11-206P5.1;RP11-302F12.3;RP11-302F12.2;SLC34A2;RP11-302F12.1 |
| 6 | 25 312 755 | 33 098 966 | 68 | 1079 | 7786 | 20Pop | -Europe | IP-DAP | Caribbean (Z Min = −3.1, Max = −6.5) EuroAme (Z Min = −3, Max = −7) | 373 | Not listed (373 genes) |
| 6 | 34 102 061 | 35 495 811 | 8 | 64 | 1394 | 20Pop, 1090Ind | -America | IP-DAP | Mexican (Z Min = −3.1, Max = −3.4) PEL (Z Min = −4.3, Max = −4.8) | 20 | GRM4;PACSIN1;SPDEF;RP3-391O22.3;C6orf106;RPL7P25;UHRF1BP1;TAF11;ANKS1A;HSPE1P11;TCP11;SCUBE3;RP3-329A5.1;ZNF76;DEF6;MKRNP2;FANCE;RPL10A;TEAD3;TULP1 |
| 8 | 121 904 711 | 122 498 253 | 6 | 66 | 594 | 20Pop | +Europe | IP-DAP | Dominican (Z = 3.1) Puerto (Z = 3) | 5 | RP11-369K17.1;AC011626.1;RP11-369K17.2;AC027238.1;RPL35AP19 |
| 8 | 10 706 801 | 11 499 967 | 8 | 108 | 793 | 20Pop | -Europe | IP-DAP | Caribbean (Z Min = −3.1, Max = −3.4) EuroAme (Z Min = −9, Max = −9.5) | 23 | RP11-177H2.2;XKR6;MIR598;AF131215.6;AF131215.9;AF131215.2;AF131215.4;AF131215.8;LINC00529;RPL19P13;MTMR9;AF131216.6;SLC35G5;TDH;AF131216.5;C8orf12;RN7SL293P;RNU6-1084P;FAM167A;BLK;RP11-148O21.3;RP11-148O21.6;LINC00208 |
| 9 | 38 615 175 | 38 771 831 | 2 | 44 | 157 | 20Pop | -Europe | IP-DAP | AfroAme (Z Min = −3, Max = −3.6) EuroAme (Z Min = −5.2, Max = −6.5) Mexican (Z Min = −3.2, Max = −3.3) | 5 | GLIS3;ANKRD18A;FAM201A;RP13-198D9.3;RNU6-765P |
| 9 | 38 615 175 | 38 771 831 | 2 | 44 | 157 | 1090Ind | -Europe | IP-DAP | AfroAme (Z Min = −3.2, Max = −3.4) Mexican (Z Min = −3.2, Max = −3.3) | 5 | GLIS3;ANKRD18A;FAM201A;RP13-198D9.3;RNU6-765P |
| 11 | 43 908 230 | 44 296 591 | 4 | 68 | 388 | 1090Ind | +America | IS-DAP | Caribbean (Z = 3) PUR (Z Min = 3.1, Max = 3.3) | 12 | OR52B3P;TRIM21;ALKBH3;C11orf96;RP11-613D13.4;ALKBH3-AS1;RP11-613D13.8;ACCSL;ACCS;CTD-2609K8.3;EXT2;ALX4 |
| 11 | 46 210 259 | 46 297 631 | 1 | 10 | 87 | 1090Ind | +America | IS-DAP | Caribbean (Z = 3.1) PUR (Z = 3.1) | 5 | TRIM68;RP11-702F3.4;CTD-2589M5.5;CTD-2589M5.4;CREB3L1 |
| 11 | 56 000 288 | 56 184 888 | 2 | 21 | 185 | 1090Ind | +America | IS-DAP | Caribbean (Z = 3.1) PUR (Z Min = 3.1, Max = 3.3) | 12 | OR5T2;OR8K3;OR8K2P;FAM8A2P;OR8K1;OR8J1;RPL5P29;OR8U1;OR8L1P;OR5AL2P;OR5AL1;OR5R1 |
| 13 | 19 612 262 | 19 690 836 | 1 | 9 | 79 | 20Pop | -Europe | IP-DAP | AfroAme (Z = −3.3) Caribbean (Z = −3.3) EuroAme (Z Min = −14.4, Max = −16) | 2 | PHF2P2;RNA5SP24 |
| 14 | 20 445 618 | 20 697 600 | 3 | 46 | 252 | 20Pop | -Europe | IP-DAP | Caribbean (Z Min = −3, Max = −3.1) EuroAme (Z Min = −6.1, Max = −8) | 17 | OR4K15;OR4Q2;OR4K14;OR4K13;AL359218.1;OR4U1P;OR4L1;RNA5SP380;OR4T1P;OR4K17;OR11G1P;RP11-98N22.6;OR11G2;OR11H5P;OR11H6;AL356019.1;OR11H7 |
| 15 | 22 837 143 | 23 975 482 | 6 | 69 | 1138 | 20Pop | -Europe | IP-DAP | AfroAme (Z Min = −3.4, Max = −9.2) Argentina (Z Min = −4.9, Max = −5.2) Caribbean (Z Min = −7.1, Max = −8.2) EuroAme (Z Min = −4.2, Max = −35.1) Mexican (Z Min = −4.8, Max = −5.4) MXL (Z Min = −3.2, Max = −3.6) PUR (Z Min = −4.3, Max = −4.8) | 7 | TUBGCP5;CYFIP1;NIPA2;NIPA1;MIR4508;MKRN3;RP11-73C9.1 |
| 15 | 22 837 143 | 23 053 839 | 3 | 43 | 217 | 1090Ind | -Europe | IP-DAP | AfroAme (Z Min = −7.2, Max = −8.4) Argentina (Z Min = −4.8, Max = −5) Caribbean (Z Min = −6.1, Max = −7.1) EuroAme (Z Min = −3.4, Max = −3.9) Mexican (Z Min = −4.8, Max = −5.3) MXL (Z Min = −3.2, Max = −3.6) PUR (Z Min = −4.3, Max = −4.8) | 4 | TUBGCP5;CYFIP1;NIPA2;NIPA1 |
| 15 | 65 613 654 | 65 695 283 | 1 | 10 | 82 | 1090Ind | -Africa | IP-DAP | ACB (Z = −3.1) Dominican (Z = −3) | 2 | IGDCC3;IGDCC4 |
| 21 | 15 412 399 | 15 599 963 | 2 | 17 | 188 | 20Pop | -Europe | IP-DAP | Caribbean (Z = −3.2) EuroAme (Z Min = −8.2, Max = −9.5) | 6 | AP001347.6;RNA5SP488;ANKRD20A18P;LIPI;ERLEC1P1;RBM11 |
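The 'Wind' and 'Populations (Z)' columns suggest a window-based screen: each 100 kb window gets a Z score for a given ancestry, and adjacent outlier windows are merged into one region. The sketch below illustrates that idea only. It assumes that Z is the deviation of a window's ancestry proportion from the genome-wide mean in units of standard deviation, and that |Z| ≥ 3 is the cut-off (the smallest |Z| reported in the table is 3). The function names and the toy proportions are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def flag_dap_windows(ancestry_by_window, z_threshold=3.0):
    """Return (window_index, z) for windows whose ancestry proportion deviates
    from the genome-wide mean by at least z_threshold standard deviations.

    Assumption: a plain genome-wide Z score; the study's exact definition
    is not given in this record.
    """
    mu = mean(ancestry_by_window)
    sigma = pstdev(ancestry_by_window)
    if sigma == 0:
        return []  # no variation across windows, nothing can be an outlier
    flagged = []
    for i, proportion in enumerate(ancestry_by_window):
        z = (proportion - mu) / sigma
        if abs(z) >= z_threshold:
            flagged.append((i, z))
    return flagged


def merge_adjacent(indices):
    """Group consecutive window indices into (first, last) regions."""
    regions = []
    for i in sorted(indices):
        if regions and i == regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], i)  # extend the current region
        else:
            regions.append((i, i))  # start a new region
    return regions


# Toy data (hypothetical): 50 windows near 0.40 European ancestry,
# with two adjacent windows strongly depleted for that ancestry.
proportions = [0.40 + 0.01 * ((i % 5) - 2) for i in range(50)]
proportions[20] = 0.10
proportions[21] = 0.12

hits = flag_dap_windows(proportions)
regions = merge_adjacent([i for i, _ in hits])
```

In this toy case the two depleted windows are flagged with negative Z scores and merged into a single region spanning two windows, which mirrors how a row with 'Wind' = 2 and a '-Europe' label would arise.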
Figure 2
Comparison of the distribution of PHRED-scaled C-score values in windows with divergent ancestry profiles (DAPs) with that in non-divergent windows, for European and American ancestries in the 20Pop dataset. The asterisk marks a statistically significant P-value (Wilcoxon test, Bonferroni-corrected alpha = 0.01). The number of analysed windows is reported in Supplementary Material, Table S5A.
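A comparison of this kind can be sketched as a two-sample rank test followed by a Bonferroni threshold. The code below is a minimal illustration, assuming the 'Wilcoxon test' is the rank-sum (Mann-Whitney) variant for two independent groups of windows, and using a normal approximation with tie correction and no continuity correction. The scores, the family-wise alpha of 0.05 and the count of five tests are hypothetical values chosen for the example, not figures from the study.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum P-value (normal approximation)."""
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # U statistic for sample x
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0  # all values identical, no evidence of a shift
    z = (u1 - n1 * n2 / 2) / sqrt(variance)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return family_alpha / n_tests


# Toy C-scores (hypothetical): DAP windows versus non-divergent windows.
dap_scores = [12.1, 15.3, 9.8, 18.4, 14.2, 16.7, 13.5, 17.9]
background_scores = [3.2, 5.1, 4.4, 6.8, 2.9, 7.3, 5.6, 4.1]

p_value = rank_sum_pvalue(dap_scores, background_scores)
alpha = bonferroni_alpha(0.05, 5)  # assumed family alpha and test count
significant = p_value < alpha
```

For real analyses with many windows, an established implementation such as `scipy.stats.mannwhitneyu` would normally be preferable to this hand-rolled version.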
Figure 3
Comparison of the distribution of Annotypes (Coding Transcript, Intergenic, Non-coding Transcript, Regulatory Feature, Transcript) in divergent ancestry profile (DAP) windows with that in non-divergent windows, for European (A) and American (B) ancestries in the 20Pop dataset. The asterisk marks a statistically significant P-value (Wilcoxon test, Bonferroni-corrected alpha = 0.002). The number of analysed windows is reported in Supplementary Material, Table S5B.
Figure 4
Gene-set enrichment analysis results for the European ancestry of the 20Pop dataset (Supplementary Material, Table S6A–D). Only the first 10 enriched terms per library are shown. Libraries: Genome-wide Association Studies (GWAS) Catalog 2019, GTEx Tissue Sample Gene Expression Profiles up, Kyoto Encyclopedia of Genes and Genomes (KEGG) 2019 Human and Gene Ontology (GO) 2018.