Kenneth K Kidd, Baigalmaa Evsanaa, Ariunaa Togtokh, Jane E Brissenden, Janet M Roscoe, Mustafa Dogan, Pavlos I Neophytou, Cemal Gurkan, Ozlem Bulbul, Lotfi Cherni, William C Speed, Michael Murtha, Judith R Kidd, Andrew J Pakstis.
Abstract
Population genetic studies of North Asian ethnic groups have focused on genetic variation of sex chromosomes and mitochondria. Studies of the extensive autosomal variation have appeared infrequently. We focus on relationships among population samples using new North Asia microhaplotype data. We combined genotypes from our laboratory on 58 microhaplotypes, distributed across 18 autosomes, on 3945 individuals from 75 populations with corresponding data extracted for 26 populations from the Thousand Genomes consortium and for 22 populations from the GenomeAsia 100 K project. A total of 7107 individuals in 122 populations are analyzed using STRUCTURE, Principal Component Analysis, and phylogenetic tree analyses. North Asia populations sampled in Mongolia include Buryats, Mongolians, Altai Kazakhs, and Tsaatans. Available Siberians include samples of Yakut, Khanty, and Komi Zyriane. Analyses of all 122 populations confirm many known relationships and show that most populations from North Asia form a cluster distinct from all other groups. Refinement of analyses on smaller subsets of populations reinforces the distinctiveness of North Asia and shows that the North Asia cluster identifies a region that is ancestral to Native Americans.
Year: 2022 PMID: 35508562 PMCID: PMC9068624 DOI: 10.1038/s41598-022-10706-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
North Asian population samples studied in the Kidd Lab.
Table 2. The list of populations and sample sizes ordered by broad geographic regions.
| Region | Source | N | Population | Abrv | Region | Source | N | Population | Abrv |
|---|---|---|---|---|---|---|---|---|---|
| Cen.Africa | KL | 69 | Biaka | BIA | SoCenAsia | 1 KG | 102 | Telugu | ITU |
| | KL | 38 | Mbuti | MBU | | KL | 30 | Keralites | KER |
| | KL | 8 | Lisongo | LIS | | AG | 34 | Urban Chennai, India | CNI |
| W.Africa | 1 KG | 113 | Gambians | GWD | | AG | 34 | Urban Bangalore, India | BGL |
| | 1 KG | 85 | Mende,SierraLeone | MSL | | AG | 17 | Lambada, India | LMB |
| | 1 KG | 99 | Esan | ESN | | AG | 19 | Mahar, India | MHR |
| | 1 KG | 108 | Yoruba, Ibadan | YRI | | AG | 17 | Agharia, India | AGH |
| | KL | 77 | Yoruba, Benin City | YOR | | AG | 20 | Toda, India | TOD |
| | KL | 48 | Ibo | IBO | | KL | 13 | Thoti | THT |
| | KL | 39 | Hausa | HSA | | 1 KG | 102 | Tamil, SriLanka | STU |
| E.Africa | 1 KG | 99 | Luhya, Kenya | LWK | | 1 KG | 86 | Bengali, Bangladesh | BEB |
| | AG | 17 | Masai, Kenya | MKK | | KL | 17 | Kachari | KCH |
| | KL | 20 | Masai, Tanzania | MAS | | AG | 15 | Oraon, India | ORA |
| | KL | 45 | Chagga | CGA | | AG | 17 | Konda Reddy, India | KND |
| | KL | 40 | Sandawe | SND | | AG | 20 | Birhor, India | BIR |
| | KL | 38 | Zaramo | ZRM | | AG | 15 | Onge, India | ONG |
| Admixed | 1 KG | 96 | Afro-Caribbeans | ACB | | AG | 15 | Hazara, India | HZA |
| Admixed | 1 KG | 61 | AfrAmer, SW | ASW | | AG | 20 | Mog, India | MOG |
| Admixed | KL | 89 | AfrAmericans | AAM | North Asia | KL | 45 | Khanty | KTY‡ |
| | KL | 32 | Ethiopians | ETJ | | KL | 57 | Altai Kazakhs | AKZ‡ |
| N.Africa | KL | 86 | So.Tunisians | TNS | | KL | 55 | Mongolians | OMG‡ |
| S.W.Asia | KL | 40 | Yemenites | YMJ | | KL | 43 | Tsaatan | TSA‡ |
| | KL | 79 | Saudi | SAU | | AG | 87 | Buryat | BUR‡ |
| | KL | 15 | Kuwaiti | KWT | | KL | 51 | Yakut | YAK‡ |
| | KL | 62 | Palestinian Arabs | PLA | East Asia | KL | 54 | Koreans | KOR‡ |
| | KL | 101 | Druze | DRU | | AG | 150 | Koreans | KRE‡ |
| | KL | 39 | Samaritans | SAM | | KL | 48 | Japanese | JPN‡ |
| | KL | 20 | Chaldeans | CHL | | 1 KG,AG | 134 | Japanese,Tokyo | JPT |
| | KL | 8 | Shabaks | SHB | | 1 KG | 103 | Han Chinese, Beijing | CHB‡ |
| | KL | 117 | Syriacs | SYR | | 1 KG | 105 | So. Han Chinese | HCS |
| | KL | 140 | Yazidi | YZD | | KL | 48 | Chinese, Taiwan | CHT |
| | KL | 133 | Kurds | KRD | | KL | 41 | Hakka | HKA |
| | KL | 113 | Turkmen | TKM | | KL | 58 | Chinese,SanFrancisco | CHS |
| | KL | 114 | Arabs, N.Iraq | NIA | | KL | 39 | Ami | AMI |
| | KL | 39 | Iranians | IRN | | KL | 42 | Atayal | ATL |
| | KL | 82 | Turkish | TRK | | 1 KG | 93 | Dai | CDX |
| Europe | KL | 50 | Turkish Cypriots | TCP | | 1 KG | 99 | Vietnamese | KHV |
| | KL | 89 | Greek Cypriots | GCP | | KL | 118 | Laotians | LAO |
| | KL | 54 | Adygei | ADY | | KL | 24 | Cambodians | CBD |
| | KL | 78 | Ashkenazi | ASH | | KL | 11 | Malaysians | MLY |
| | KL | 51 | Greeks | GRK | | AG | 25 | Austronesians, Indonesia | ASN |
| | 1 KG | 107 | Tuscans, Italy | TSI | | AG | 21 | Ati, Philippines | ATI |
| | KL | 27 | Roman Jews | RMJ | | AG | 20 | Rampasasa,Flores,Indonesia | FLR |
| | KL | 35 | Sardinians | SRD | | AG | 29 | Aeta, Philippines | AET |
| | 1 KG | 107 | Iberians | IBS | Pacific | KL | 22 | Papuans, New Guinea | PNG |
| | KL | 88 | Hungarians | HGR‡ | | KL | 23 | Nasioi, Bougainville | NAS |
| | 1 KG | 99 | N&W Euro.Ancestry | CEU | | KL | 34 | Micronesians | MCR |
| | KL | 85 | EuroAmericans | EAM | | KL | 9 | Samoans | SMO |
| | 1 KG | 91 | Great Britain | GBR | Americas | KL | 56 | Plains AmerIndians | NPA‡ |
| | KL | 114 | Irish | IRI‡ | | KL | 51 | SW AmerIndians | SWA |
| | KL | 51 | Danes | DAN | | KL | 53 | Pima, Mexico | PMM‡ |
| | KL | 42 | Chuvash | CHV | | KL | 50 | Maya | MAY |
| | KL | 47 | Russia,Vologda | RUV | | KL | 12 | Guihiba | GHB |
| | KL | 33 | Russians,Archangel | RUA | | KL | 22 | Quechua | QUE |
| | KL | 35 | Finns | FIN | | 1 KG | 85 | Peruvians | PEL |
| | 1 KG | 99 | Finns | FN1 | | KL | 65 | Ticuna | TIC |
| W.Siberia | KL | 46 | Komi Zyriane | KMZ‡ | | KL | 44 | Surui,Rondonia | SUR |
| SoCenAsia | AG | 15 | Pathans, Pakistan | PHA | | KL | 55 | Karitiana | KAR |
| | AG | 20 | Gujjar, Pakistan | GJJ | Admixed | 1 KG | 64 | MexAmr,LosAngeles | MXL |
| | 1 KG | 96 | Punjabi, Lahore | PJL | Admixed | 1 KG | 104 | Puerto Ricans | PUR |
| | 1 KG | 103 | Gujarati, Houston | GIH | Admixed | 1 KG | 94 | Colombians | CLM |
‡ indicates the 15 populations included in the analyses of North Asia along with outlier groups from nearby regions of the world. Source: KL = Kidd Lab; AG = GenomeAsia 100 K project; 1 KG = Thousand Genomes consortium. Abrv: three-character abbreviation for populations used in various images and tables.
Figure 1. Scatterplot of Ae by In for 58 microhaps evaluated on 122 populations. See text.
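The two axes of Figure 1 can be illustrated with a minimal sketch. It assumes that Ae denotes the effective number of alleles, 1 / Σp², and that In denotes Rosenberg's informativeness for assignment, as is common in the microhaplotype literature; this record does not define either statistic. The function names and the example frequencies are hypothetical, and this is not the authors' pipeline.

```python
import math


def effective_alleles(freqs):
    """Effective number of alleles, Ae = 1 / sum(p_i^2), for one population.

    freqs: haplotype frequencies at one locus, summing to 1.
    """
    return 1.0 / sum(p * p for p in freqs)


def informativeness(freqs_by_pop):
    """Rosenberg's informativeness for assignment (In) at one locus.

    freqs_by_pop: one frequency list per population, each listing the
    same haplotypes in the same order. Populations are weighted equally.
    """
    k = len(freqs_by_pop)
    n_hap = len(freqs_by_pop[0])
    total = 0.0
    for j in range(n_hap):
        column = [pop[j] for pop in freqs_by_pop]
        mean = sum(column) / k
        if mean > 0:
            total -= mean * math.log(mean)  # entropy of the pooled frequency
        # minus the average within-population entropy (0 * log 0 taken as 0)
        total += sum(p * math.log(p) for p in column if p > 0) / k
    return total


# Hypothetical frequencies for one four-haplotype locus in two populations.
pop_a = [0.25, 0.25, 0.25, 0.25]
pop_b = [0.70, 0.10, 0.10, 0.10]
ae_values = [effective_alleles(pop_a), effective_alleles(pop_b)]
in_value = informativeness([pop_a, pop_b])
```

Under these assumptions, a locus with high Ae is highly heterozygous within populations, while a locus with high In differs strongly in haplotype frequencies between populations; a scatterplot of one against the other separates the two properties.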
Figure 2. STRUCTURE results by population averages for 122 populations at K = 9. The result with the highest likelihood is shown as population averages. AMR indicates the American populations from the 1000 Genomes dataset that have very low Native American ancestry and have predominantly European and some West African ancestry. The plots for K = 7 through 9 are shown in Supplemental Fig. S2; plots of likelihoods for K values are in Supplemental Fig. S5.
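The "population averages" shown in the STRUCTURE figures can be sketched as follows, assuming they are the per-population means of the individual cluster membership coefficients that STRUCTURE estimates. The function name, the labels, and the coefficient values are hypothetical; the record does not describe how the authors computed the averages.

```python
from collections import defaultdict


def population_averages(labels, q_rows):
    """Average individual membership coefficients within each population.

    labels: population abbreviation for each individual.
    q_rows: one list of K membership coefficients per individual.
    Returns a dict mapping each abbreviation to its K averaged coefficients.
    """
    sums = {}
    counts = defaultdict(int)
    for label, row in zip(labels, q_rows):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        for k, q in enumerate(row):
            sums[label][k] += q
        counts[label] += 1
    return {
        label: [s / counts[label] for s in cluster_sums]
        for label, cluster_sums in sums.items()
    }


# Hypothetical coefficients for three individuals at K = 2.
labels = ["YAK", "YAK", "KOR"]
q_rows = [[0.75, 0.25], [0.25, 0.75], [1.0, 0.0]]
averages = population_averages(labels, q_rows)
```

Averaging hides variation among individuals within a population, which is presumably why the individual-level plots are provided separately in the supplemental figures.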
Figure 3. PCA results for 122 populations. (a) PC1 by PC2. (b) PC1 by PC3.
Figure 4. STRUCTURE results by population averages for 74 populations at K = 7 through K = 9. At each K value the result with the highest likelihood is shown as population averages. This subset of the 122 populations is in the same order as in Table 2. The plots of individuals for these three K values are shown in Supplemental Fig. S4; plots of likelihoods for K values are in Supplemental Fig. S6. In the lower part of the figure, individual bar plots are shown for East Asia with the individuals sorted by the major cluster values within each population.
Figure 5. PCA results for 74 populations. (a) PC1 by PC2. (b) PC1 by PC3.
Figure 6. Best exact Least Squares tree for 74 populations. This best of 445 different trees examined has no internal negative segments but does have eight external negative segments, leading to the CHV, FIN1, GIH, PJL, BEB, PEL, BUR, and JPT populations. In their local positions in the graph, these segments are small in absolute value and indicate small deviations from true additivity in the pairwise genetic distances. The affected populations generally fall in reasonable positions with respect to what is known about their relationships. For example, the FIN1 sample from 1 KG clusters with the independent FIN sample from Kidd Lab.
Figure 7. STRUCTURE results by population averages for 15 populations at K = 5 through K = 7. At each K value the result with the highest likelihood is shown as population averages. The plots of individuals for these three K values are shown in Supplemental Fig. S6; plots of likelihoods for K values are in Supplemental Fig. S8.
Figure 8. PCA results for 15 populations. (a) PC1 by PC2. (b) PC1 by PC3.
Figure 9. Best exact Least Squares tree for 15 populations. Negative branches to the Buryat, Altai Kazakhs, and Khanty are orange.