
The Heart of Silk Road "Xinjiang," Its Genetic Portray, and Forensic Parameters Inferred From Autosomal STRs.

Atif Adnan1,2,3, Adeel Anwar4, Halimureti Simayijiang5, Noor Farrukh2, Sibte Hadi2, Chuan-Chao Wang3,6,7, Jin-Feng Xuan1.   

Abstract

The Xinjiang Uyghur Autonomous Region of China (XUARC) harbors almost 50 ethnic groups, including the Uyghur (UGR: 45.84%), Han (HAN: 40.48%), Kazakh (KZK: 6.50%), Hui (HUI: 4.51%), Kyrgyz (KGZ: 0.86%), Mongol (MGL: 0.81%), Manchu (MCH: 0.11%), and Uzbek (UZK: 0.066%), which makes it one of the most colorful regions, with abundant cultural and genetic diversity. In our previous study, we established allelic frequency databases for 14 autosomal short tandem repeats (STRs) for four minority populations from XUARC (MCH, KGZ, MGL, and UZK) using the AmpFlSTR® Identifiler PCR Amplification Kit. In this study, we genotyped 2,121 samples using the GoldenEye™ 20A Kit (Beijing PeopleSpot Inc., Beijing, China), which amplifies 19 autosomal STR loci, for four major ethnic groups (UGR, HAN, KZK, and HUI). These groups make up 97.33% of the total XUARC population. The total number of alleles across the 19 STRs in these populations ranged from 224 (KZK) to 232 (HAN). We did not observe any departures from Hardy-Weinberg equilibrium (HWE) in these populations after sequential Bonferroni correction. We did find minimal departure from linkage equilibrium (LE) for a small number of pairwise combinations of loci. The match probabilities for the different populations ranged from 1 in 1.66 × 10²³ (HAN) to 6.05 × 10²⁴ (HUI), the combined power of exclusion ranged from 0.999 999 988 (HUI) to 0.999 999 993 (UGR), and the combined power of discrimination ranged from 0.999 999 999 999 999 999 999 983 (HAN) to 0.999 999 999 999 999 999 999 997 (UGR). Genetic distances, principal component analysis (PCA), STRUCTURE analysis, and the phylogenetic tree showed that genetic affinity among the studied populations is consistent with linguistic, ethnic, and geographical classifications.
Copyright © 2021 Adnan, Anwar, Simayijiang, Farrukh, Hadi, Wang and Xuan.


Keywords:  Kazakh; Uyghur; allelic frequency database; autosomal STRs; han; hui; phylogenetics

Year:  2021        PMID: 34976009      PMCID: PMC8719170          DOI: 10.3389/fgene.2021.760760

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


1 Introduction

The Altai is a mountain range in Central and East Asia that stretches across Kazakhstan and Russia in the west and Xinjiang in northwest China and Mongolia in the east. In ancient times, Europeans settled on the western side and Asians on the eastern side of these mountains (Li et al., 2019). Later, the region was used as a corridor by European and Asian populations. This corridor played an important role in the formation of new ethnic groups and in the population diversity seen today (Ovodov et al., 2011). The Xinjiang Uyghur Autonomous Region (XUAR) is the most diverse region in China and is home to almost 50 ethnic groups. XUAR played a vital role in early history because it connected not only the Altai mountain corridor but also western and eastern Eurasia (Esposito, 1999). It was also the main hub of the famous Silk Road, which linked trade between the Middle East, East Asia, Central Asia, South Asia, and Europe (Esposito, 1999). XUAR is divided into two basins, the Dzungarian (North) Basin and the Tarim (South) Basin. Many ethnic groups, including the Uyghur (UGR), Kazakh (KZK), Hui (HUI), Han (HAN), Manchu (MCH), Mongol (MGL), Kyrgyz (KGZ), and Uzbek (UZK), have lived there for hundreds of years (Millward, 2007). The Uyghurs are one of the officially recognized ethnic minorities in China, and most of them live in the Tarim Basin. The official language is Mandarin; however, other languages are also spoken in Xinjiang. The Uyghur language belongs to the Karluk branch of the Turkic language family. Xinjiang has been a multiethnic region since ancient times and is mentioned in Chinese records from 206 BC to 220 AD (Millward, 2007). The Uyghurs are the main ethnic group in Xinjiang and represent 45.84% of the total population (Millward, 2007; Wyatt and Di Cosmo, 2011). According to some historians, the Kazakh (KZK) population emerged around the 13th century. 
The ancestors of modern-day Kazakhs were the result of an admixture of this group with Mongol tribes from Eastern Chagatai Khanate. They speak the Kazakh language, which also belongs to the Turkic language family. Until the 15th century, they were part of nomadic tribes. Kazakh is the third largest ethnic group in Xinjiang and represents 6.50% of the total Xinjiang population (Weller, 2006; Weatherford, 2012). Ancestors of the Hui came to China from Central Asia and Islamic Persia as handicraftsmen, merchants, scholars, and soldiers. This human movement started in the seventh century and continued until the 13th century. After settling in China, they started intermingling with Han Chinese, Mongols, and Uyghurs. Ultimately, their phenotype, cultural characteristics, and language became thoroughly Chinese. Hui is the fourth largest ethnic group in Xinjiang and represents 4.51% of the total Xinjiang population (Dillon, 1999). According to historical records, all Han Chinese can trace their origins to the Huaxia tribes, which were agricultural tribes living along the Yellow River and formed during the Shang and Zhou dynasties (21st–8th centuries BC) (Cioffi-Revilla and Lai, 1995). With the rise of the Han Dynasty, all these tribes formed a common tribe/ethnic group known as the Han Chinese (Du and Vincent, 1993). Han Chinese is the world’s largest ethnic group and accounts for 18% of the worldwide population. Han is the dominant group not only in mainland China (92%) but also in Singapore (75%). Han is the second largest ethnic group in Xinjiang and represents 40.48% of the total Xinjiang population. DNA regions with repeat units of 2–6 bp in length are called short tandem repeats (STRs), also known as microsatellites. 
STR markers are present throughout the human genome and usually have stable polymorphisms, short sequence lengths, and a dense, uniform chromosomal distribution, which makes their detection and analysis straightforward using PCR and sequencing (Hammond et al., 1994; Sánchez-Diz et al., 2009). In forensic investigations such as paternity cases, rape cases, kinship analysis, and missing person analysis, STRs are considered the markers of choice because of their high polymorphism (Adnan et al., 2017; 2018b; 2018c). The Goldeneye™ 20A is a five-dye kit (Beijing PeopleSpot Inc., Beijing, China) that includes 16 combined DNA index system (CODIS) core STR loci along with Penta E, Penta D, and D6S1043 (Huang et al., 2013). Few studies have characterized autosomal STRs in the main ethnic groups of Xinjiang, such as the Uyghur (Yuan et al., 2016) and Kazakh (Zhang et al., 2016b). A drawback of these previous studies is that they focus on only one or at most two ethnic groups. In this study, we genotyped 19 autosomal STRs in four main ethnic groups from Xinjiang (UGR, KZK, HUI, and HAN) using the Goldeneye™ 20A kit. We combined the data generated in this study with our previously published work on Manchu (MCH), Mongol (MGL), Kyrgyz (KGZ), and Uzbek (UZK) populations from Xinjiang. We then compared the genotypic data with 97 other worldwide populations.

2 Materials and methods

2.1 Samples and DNA extraction

Blood samples were collected from 2,121 unrelated healthy individuals from the XUAR: 533 Uyghur (F = 230, M = 303), 436 Kazakh (F = 204, M = 232), 593 Hui (F = 257, M = 336), and 559 Han (F = 213, M = 346). All participants gave their informed consent either orally and with thumbprints (in case they could not write) or in writing after the study aims and procedures were carefully explained to them in their language. The study was approved by the ethical review board (dated March 20, 2019, with approval reference no. 2019-84-P) of the China Medical University, Shenyang, Liaoning Province, People’s Republic of China. All blood samples were stored at −20°C before DNA extraction. DNA was isolated from blood using the ReliaPrep™ Blood gDNA Miniprep System (Promega, Madison, WI, USA) according to the manufacturer’s instructions. The quantities of extracted DNA samples were determined using a NanoDrop spectrophotometer (Thermo Scientific, Wilmington, DE, USA), and the DNA was diluted to a final concentration of 1–2 ng/μl.

2.2 PCR amplification

Multiplex PCR amplification of 19 autosomal STR loci (D18S51, D21S11, TH01, D3S1358, FGA, TPOX, D8S1179, vWA, CSF1PO, D16S539, D7S820, D13S317, D2S1338, D19S433, D5S818, D12S391, D6S1043, Penta D, and Penta E) and the gender marker Amelogenin was performed using the Goldeneye™ 20A kit (Beijing PeopleSpot Inc.) for all extracted DNA samples. We amplified 1–2 ng of extracted DNA according to the manufacturer’s recommended protocol. Thermal cycling was conducted in a GeneAmp PCR System 9700 thermal cycler (Applied Biosystems, Foster City, CA, USA) using the following conditions: 95°C for 5 min; 30 cycles of 94°C for 30 s, 60°C for 60 s, and 70°C for 60 s; and a final extension at 60°C for 30 min.

2.3 Genotyping

Amplified products were prepared using the Goldeneye™ ORG 500 internal size standard (Beijing PeopleSpot Inc.) and HiDi formamide. All samples were electrophoresed on a 3500 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) according to the Goldeneye™ 20A standard protocol. POP-4™ polymer (Life Technologies, Carlsbad, CA, USA) was used for capillary electrophoresis (CE). The Goldeneye™ 20A kit allelic ladder was run with each CE injection. Alleles were called against the allelic ladder using GeneMapper™ Software version 4.0 (Life Technologies), following the ISFG recommendations (Gusmão et al., 2006; Bodner et al., 2016).

2.4 Quality control

All extraction and quantitation batches included sterile water (sH2O) as negative control. Negative (sH2O) and positive (AmpFlSTR® Control DNA 9947A) controls were employed for PCR amplification and capillary electrophoresis. All negative controls displayed an absence of amplified product while positive controls were consistent with known genotypes.

2.5 Statistical analysis

The allele frequencies of STR loci, exact tests of Hardy–Weinberg equilibrium (HWE), and pairwise linkage disequilibrium tests were calculated with PowerMarker V3.25 [13]. The values for matching probability (MP), power of discrimination (PD), polymorphism information content (PIC), power of exclusion (PE) duos, typical paternity index (TPI), gene diversity (GD), and expected heterozygosity (He) were calculated using the PowerStats software (Ver 1.2, Promega, Madison, WI, USA) [14]. Nei’s standard genetic distance between current and previously published reference populations (Law et al., 2002; Berti et al., 2005; Tie et al., 2006; Barni et al., 2007; Deng et al., 2007; Maruyama et al., 2008; Demeter et al., 2010; Stepanov et al., 2010; Zhu et al., 2010; Deng et al., 2011; Kee et al., 2011; Xing et al., 2011; Yoo et al., 2011; Chen et al., 2012; Di Cristofaro et al., 2012; Westen et al., 2012; Gaviria et al., 2013; Hedjazi et al., 2013; Nasibov et al., 2013, 2013; Park et al., 2013; Bentayebi et al., 2014; Fujii et al., 2014; Almeida et al., 2015; Gurkan et al., 2015; Parolin et al., 2015; Shafique et al., 2015; Shrivastava et al., 2015; Zhang, 2015b, 2015a; Zhang et al., 2015; Aguilar-Velázquez et al., 2016; Hossain et al., 2016; Nakamura et al., 2016; Ng et al., 2016; Park et al., 2016; Ramos-González et al., 2016; Ristow et al., 2016; Shan et al., 2016; Tokdemir et al., 2016; Vullo et al., 2016; Wang et al., 2016; Zhang et al., 2016a, 2016b; Choi et al., 2017; Guerreiro et al., 2017; Jin et al., 2017; Liu et al., 2017; Moysés et al., 2017; Ossowski et al., 2017; Perveen et al., 2017; Singh and Nandineni, 2017; Taylor et al., 2017; Wu et al., 2017; Yang et al., 2017; Adnan et al., 2018c; He et al., 2018a; Wang et al., 2018; Liu et al., 2019; Adnan et al., 2020) was computed using the Phylip 3.69 package [15]. Principal component analysis (PCA) based on allele frequency correlation was conducted using the MVSP v3.22 software (http://www.kovcomp.com). The phylogenetic tree was constructed using MEGA X, and ancestry component dissection was explored using STRUCTURE v.2.3.4 software (Porras-Hurtado et al., 2013). The model-based analysis employed a burn-in period of 100,000 followed by 100,000 Markov chain Monte Carlo (MCMC) steps under the “independent allele frequencies” and “LOCPRIOR” models, with K values ranging from 2 to 10.
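To make the per-locus statistics listed above concrete, the following is a minimal sketch of how PM, PD, PIC, PE, and TPI are commonly derived from genotype data. It is not the PowerStats implementation; the function names are hypothetical, and the PE and TPI expressions (PE = h²(1 − 2hH²) and TPI = 1/(2H), with h the observed heterozygosity and H = 1 − h) are assumed to be the formulas in use because they reproduce the Table 1 values from the reported Hobs.

```python
from collections import Counter


def power_of_exclusion(h_obs):
    """PE = h^2 * (1 - 2*h*H^2), with H = 1 - h (assumed formula)."""
    hom = 1.0 - h_obs
    return h_obs ** 2 * (1.0 - 2.0 * h_obs * hom ** 2)


def typical_paternity_index(h_obs):
    """TPI = 1 / (2*H), with H the observed homozygosity (assumed formula)."""
    return 1.0 / (2.0 * (1.0 - h_obs))


def locus_statistics(genotypes):
    """Per-locus statistics from a list of (allele_a, allele_b) genotypes."""
    n = len(genotypes)
    # Genotype frequencies; (a, b) and (b, a) count as the same genotype.
    geno_counts = Counter(tuple(sorted(g)) for g in genotypes)
    pm = sum((c / n) ** 2 for c in geno_counts.values())
    # Allele frequencies from the 2n observed alleles.
    allele_counts = Counter(a for g in genotypes for a in g)
    p = [c / (2 * n) for c in allele_counts.values()]
    sum_p2 = sum(x ** 2 for x in p)
    sum_p4 = sum(x ** 4 for x in p)
    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    pic = 1.0 - sum_p2 - (sum_p2 ** 2 - sum_p4)
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    return {
        "PM": pm,
        "PD": 1.0 - pm,
        "PIC": pic,
        "Hobs": h_obs,
        "PE": power_of_exclusion(h_obs),
        "TPI": typical_paternity_index(h_obs),
    }


# Hypothetical toy genotypes at one locus (not data from this study).
toy = [(10, 11), (10, 10), (11, 12), (10, 11)]
stats = locus_statistics(toy)
```

As a consistency check, feeding the Uyghur CSF1PO Hobs from Table 1 (0.7317) into the two helper functions gives values that agree with the tabulated PE and TPI for that locus to within rounding.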

3 Results and discussion

3.1 Forensic parameters

Raw genotypic data of the four ethnic groups (HAN, HUI, UGR, and KZK) are provided in Supplementary Table S1, and the allelic frequencies are summarized in Supplementary Table S2. Allelic frequencies ranged from 0.0009 (CSF1PO) to 0.5262 (TPOX) in UGR, from 0.0011 (CSF1PO) to 0.5676 (TPOX) in KZK, from 0.0008 (CSF1PO) to 0.5210 (TPOX) in HUI, and from 0.0008 (D16S539) to 0.5152 (TH01) in HAN. A total of 229, 224, 229, and 232 unique alleles were observed for the 19 STRs in the UGR, KZK, HUI, and HAN populations, respectively. When we compared our newly studied populations with our previously published data on the 15 overlapping STRs, 163, 160, 167, and 170 unique alleles were observed in UGR, KZK, HUI, and HAN, respectively, while 152, 165, 153, and 168 unique alleles were observed in the MCH, MGL, KGZ, and UZK populations, respectively. The combined powers of discrimination (CPDs) for the 19 STRs were 0.999 999 999 999 999 999 999 997, 0.999 999 999 999 999 999 999 996, 0.999 999 999 999 999 999 999 993, and 0.999 999 999 999 999 999 999 983 in UGR, KZK, HUI, and HAN, respectively. The combined powers of exclusion (CPEs) for the 19 STRs were 0.999 999 993, 0.999 999 992, 0.999 999 988, and 0.999 999 993 in UGR, KZK, HUI, and HAN, respectively. The matching probabilities (MPs) were 2.21 × 10⁻²⁴, 3.98 × 10⁻²⁴, 6.05 × 10⁻²⁴, and 1.66 × 10⁻²⁴ in UGR, KZK, HUI, and HAN, respectively. On combining the data for the 15 STRs from our previous studies (MCH, MGL, KGZ, and UZK) and the current study (UGR, KZK, HUI, and HAN), the CPDs were 0.999 999 999 999 999 984 833, 0.999 999 999 999 999 990 057, 0.999 999 999 999 999 996 333, 0.999 999 999 999 999 998 244, 0.999 999 999 999 999 997 679, 0.999 999 999 999 999 995 805, 0.999 999 999 999 999 993 455, and 0.999 999 999 999 999 986 23, respectively. The CPEs were 0.999 999 416, 0.999 999 483, 0.999 997 932, 0.999 998 973, 0.999 998 617, 0.999 998 811, 0.999 998 151, and 0.999 998 15, respectively. The combined matching probabilities (CMPs) were 1/1.51 × 10¹⁷, 1/1.75 × 10¹⁸, 1/3.66 × 10¹⁸, 1/9.94 × 10¹⁸, 1/2.32 × 10¹⁸, 1/4.19 × 10¹⁸, 1/6.54 × 10¹⁸, and 1/1.37 × 10¹⁷, respectively. Penta E showed the greatest heterozygosity and PD in all four newly studied populations. Penta E was also the most polymorphic locus in terms of number of alleles in UGR (20), KZK (21), HUI (23), and HAN (21), while TH01 (6), TPOX (6), TH01 (7), and TH01 (7) were the least polymorphic in these populations, respectively. For the other forensic parameters (GD, PIC, MP, PE, and TPI), Penta E was the most informative locus, while TPOX was the least informative (Table 1). The data showed that the Goldeneye™ 20A panel can be used for forensic identification and parentage testing for the four ethnic groups in the XUAR of China.
TABLE 1

Forensic parameters of the 19 autosomal STR loci in four (Uyghur, Kazakh, Hui and Han) populations residing in Xinjiang Uyghur Autonomous Region (XUAR) in China.

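| Population | Parameter | CSF1PO | D12S391 | D13S317 | D16S539 | D18S51 | D19S433 | D21S11 | D2S1338 | D3S1358 | D5S818 | D6S1043 | D7S820 | D8S1179 | FGA | Penta D | Penta E | TH01 | TPOX | vWA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Uyghur | No. of alleles | 10 | 18 | 7 | 7 | 18 | 15 | 18 | 12 | 8 | 8 | 16 | 8 | 10 | 20 | 12 | 20 | 6 | 7 | 9 |
| Uyghur | GD | 0.7375 | 0.8570 | 0.8086 | 0.7964 | 0.8656 | 0.8227 | 0.8417 | 0.8753 | 0.7456 | 0.7411 | 0.8609 | 0.7970 | 0.8281 | 0.8622 | 0.8303 | 0.9221 | 0.7779 | 0.6398 | 0.8015 |
| Uyghur | PIC | 0.6905 | 0.8402 | 0.7809 | 0.7659 | 0.8505 | 0.8006 | 0.8224 | 0.8616 | 0.7028 | 0.7002 | 0.8448 | 0.7660 | 0.8054 | 0.8461 | 0.8075 | 0.9157 | 0.7421 | 0.5905 | 0.7726 |
| Uyghur | PM | 0.1163 | 0.0379 | 0.0662 | 0.0741 | 0.0347 | 0.0542 | 0.0484 | 0.0307 | 0.1044 | 0.1050 | 0.0352 | 0.0704 | 0.0536 | 0.0346 | 0.0520 | 0.0137 | 0.0818 | 0.1787 | 0.0695 |
| Uyghur | PD | 0.8837 | 0.9621 | 0.9338 | 0.9259 | 0.9653 | 0.9458 | 0.9516 | 0.9693 | 0.8956 | 0.8950 | 0.9648 | 0.9296 | 0.9464 | 0.9654 | 0.9480 | 0.9863 | 0.9182 | 0.8213 | 0.9305 |
| Uyghur | Hobs | 0.7317 | 0.8612 | 0.7992 | 0.8143 | 0.8480 | 0.8180 | 0.8462 | 0.8630 | 0.7298 | 0.7148 | 0.8518 | 0.7730 | 0.8424 | 0.8424 | 0.8143 | 0.8931 | 0.7467 | 0.6454 | 0.7917 |
| Uyghur | PE | 0.4790 | 0.7170 | 0.5976 | 0.6258 | 0.6910 | 0.6329 | 0.6873 | 0.7207 | 0.4759 | 0.4516 | 0.6984 | 0.5499 | 0.6799 | 0.6799 | 0.6258 | 0.7813 | 0.5042 | 0.3489 | 0.5838 |
| Uyghur | TPI | 1.8636 | 3.6014 | 2.4907 | 2.6919 | 3.2901 | 2.7474 | 3.2500 | 3.6507 | 1.8507 | 1.7533 | 3.3734 | 2.2025 | 3.1726 | 3.1726 | 2.6919 | 4.6754 | 1.9741 | 1.4101 | 2.4009 |
| Kazakh | No. of alleles | 8 | 15 | 8 | 7 | 19 | 13 | 14 | 14 | 10 | 8 | 16 | 9 | 12 | 16 | 12 | 21 | 7 | 6 | 9 |
| Kazakh | GD | 0.7355 | 0.8515 | 0.8196 | 0.8124 | 0.8641 | 0.8133 | 0.8138 | 0.8765 | 0.7396 | 0.7430 | 0.8661 | 0.7977 | 0.8169 | 0.8626 | 0.8293 | 0.9259 | 0.7626 | 0.6052 | 0.8106 |
| Kazakh | PIC | 0.6866 | 0.8332 | 0.7937 | 0.7839 | 0.8488 | 0.7890 | 0.7892 | 0.8627 | 0.6943 | 0.7028 | 0.8504 | 0.7660 | 0.7927 | 0.8462 | 0.8068 | 0.9198 | 0.7245 | 0.5554 | 0.7834 |
| Kazakh | PM | 0.1191 | 0.0399 | 0.0596 | 0.0664 | 0.0337 | 0.0597 | 0.0597 | 0.0301 | 0.1174 | 0.1059 | 0.0362 | 0.0763 | 0.0583 | 0.0354 | 0.0536 | 0.0123 | 0.0937 | 0.2111 | 0.0634 |
| Kazakh | PD | 0.8809 | 0.9601 | 0.9404 | 0.9336 | 0.9663 | 0.9403 | 0.9403 | 0.9699 | 0.8826 | 0.8941 | 0.9638 | 0.9237 | 0.9417 | 0.9646 | 0.9464 | 0.9877 | 0.9063 | 0.7889 | 0.9366 |
| Kazakh | Hobs | 0.7408 | 0.8211 | 0.8165 | 0.8165 | 0.8578 | 0.8211 | 0.8050 | 0.8647 | 0.7638 | 0.7362 | 0.8647 | 0.7936 | 0.7982 | 0.8509 | 0.8211 | 0.9151 | 0.7569 | 0.6078 | 0.8142 |
| Kazakh | PE | 0.4942 | 0.6388 | 0.6300 | 0.6300 | 0.7103 | 0.6388 | 0.6084 | 0.7240 | 0.5336 | 0.4865 | 0.7240 | 0.5872 | 0.5956 | 0.6967 | 0.6388 | 0.8264 | 0.5216 | 0.3003 | 0.6257 |
| Kazakh | TPI | 1.9292 | 2.7949 | 2.7250 | 2.7250 | 3.5161 | 2.7949 | 2.5647 | 3.6949 | 2.1165 | 1.8957 | 3.6949 | 2.4222 | 2.4773 | 3.3538 | 2.7949 | 5.8919 | 2.0566 | 1.2749 | 2.6914 |
| Hui | No. of alleles | 9 | 14 | 8 | 7 | 15 | 15 | 17 | 14 | 9 | 9 | 15 | 11 | 10 | 20 | 10 | 23 | 7 | 7 | 9 |
| Hui | GD | 0.7262 | 0.8463 | 0.8133 | 0.7926 | 0.8600 | 0.8265 | 0.8154 | 0.8632 | 0.7356 | 0.7752 | 0.8744 | 0.7885 | 0.8275 | 0.8585 | 0.8214 | 0.9226 | 0.6742 | 0.6307 | 0.8038 |
| Hui | PIC | 0.6796 | 0.8276 | 0.7855 | 0.7608 | 0.8443 | 0.8046 | 0.7922 | 0.8474 | 0.6897 | 0.7406 | 0.8602 | 0.7573 | 0.8043 | 0.8422 | 0.7977 | 0.9164 | 0.6300 | 0.5734 | 0.7742 |
| Hui | PM | 0.1207 | 0.0425 | 0.0627 | 0.0768 | 0.0365 | 0.0545 | 0.0547 | 0.0345 | 0.1093 | 0.0890 | 0.0302 | 0.0761 | 0.0557 | 0.0365 | 0.0563 | 0.0128 | 0.1520 | 0.1942 | 0.0676 |
| Hui | PD | 0.8793 | 0.9575 | 0.9373 | 0.9232 | 0.9635 | 0.9455 | 0.9453 | 0.9655 | 0.8907 | 0.9110 | 0.9698 | 0.9239 | 0.9443 | 0.9635 | 0.9437 | 0.9872 | 0.8480 | 0.8058 | 0.9324 |
| Hui | Hobs | 0.7251 | 0.8482 | 0.8078 | 0.7892 | 0.8482 | 0.8415 | 0.8027 | 0.8398 | 0.7150 | 0.7943 | 0.8482 | 0.7774 | 0.8482 | 0.8347 | 0.8027 | 0.9174 | 0.6560 | 0.6509 | 0.8027 |
| Hui | PE | 0.4682 | 0.6914 | 0.6135 | 0.5792 | 0.6914 | 0.6782 | 0.6041 | 0.6749 | 0.4519 | 0.5884 | 0.6914 | 0.5578 | 0.6914 | 0.6650 | 0.6041 | 0.8310 | 0.3635 | 0.3565 | 0.6041 |
| Hui | TPI | 1.8190 | 3.2944 | 2.6009 | 2.3720 | 3.2944 | 3.1543 | 2.5342 | 3.1211 | 1.7544 | 2.4303 | 3.2944 | 2.2462 | 3.2944 | 3.0255 | 2.5342 | 6.0510 | 1.4534 | 1.4324 | 2.5342 |
| Han | No. of alleles | 9 | 12 | 8 | 9 | 17 | 14 | 17 | 14 | 8 | 9 | 18 | 12 | 8 | 20 | 11 | 21 | 7 | 8 | 10 |
| Han | GD | 0.7306 | 0.8435 | 0.8052 | 0.7874 | 0.8613 | 0.8072 | 0.8106 | 0.8624 | 0.7181 | 0.7706 | 0.8739 | 0.7759 | 0.8462 | 0.8538 | 0.8002 | 0.9218 | 0.6543 | 0.6249 | 0.7959 |
| Han | PIC | 0.6862 | 0.8238 | 0.7762 | 0.7539 | 0.8453 | 0.7812 | 0.7861 | 0.8466 | 0.6677 | 0.7350 | 0.8597 | 0.7419 | 0.8263 | 0.8366 | 0.7748 | 0.9155 | 0.6085 | 0.5627 | 0.7641 |
| Han | PM | 0.1240 | 0.0448 | 0.0680 | 0.0810 | 0.0347 | 0.0635 | 0.0588 | 0.0351 | 0.1308 | 0.0876 | 0.0313 | 0.0849 | 0.0448 | 0.0391 | 0.0655 | 0.0132 | 0.1684 | 0.2130 | 0.0724 |
| Han | PD | 0.8760 | 0.9552 | 0.9320 | 0.9190 | 0.9653 | 0.9365 | 0.9412 | 0.9649 | 0.8692 | 0.9124 | 0.9687 | 0.9151 | 0.9552 | 0.9609 | 0.9345 | 0.9868 | 0.8316 | 0.7870 | 0.9276 |
| Han | Hobs | 0.7585 | 0.8551 | 0.7835 | 0.8014 | 0.8479 | 0.8014 | 0.7996 | 0.8730 | 0.7138 | 0.7657 | 0.8784 | 0.7692 | 0.8533 | 0.8676 | 0.7889 | 0.9177 | 0.6708 | 0.6315 | 0.7692 |
| Han | PE | 0.5244 | 0.7049 | 0.5689 | 0.6017 | 0.6908 | 0.6017 | 0.5984 | 0.7406 | 0.4499 | 0.5369 | 0.7515 | 0.5432 | 0.7014 | 0.7299 | 0.5786 | 0.8317 | 0.3846 | 0.3304 | 0.5432 |
| Han | TPI | 2.0704 | 3.4506 | 2.3099 | 2.5180 | 3.2882 | 2.5180 | 2.4955 | 3.9366 | 1.7469 | 2.1336 | 4.1103 | 2.1667 | 3.4085 | 3.7770 | 2.3686 | 6.0761 | 1.5190 | 1.3568 | 2.1667 |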

GD: gene diversity; PIC: polymorphism information content; Hobs: observed heterozygosity; PD: power of discrimination; PM: matching probability; PE: power of exclusion; TPI: typical paternity index.
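The combined statistics quoted in the text follow from the per-locus values in Table 1 by multiplication across loci, assuming the loci are independent. The sketch below illustrates this for the Uyghur sample: the combined matching probability is the product of the per-locus PM values, and the combined power of discrimination is its complement. The variable and function names are hypothetical, and `Decimal` is used only because ordinary floating point cannot resolve a value this close to 1.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # enough digits to resolve 1 - 1e-24

# Per-locus matching probabilities (PM) for the Uyghur sample, copied from
# Table 1 in locus order CSF1PO ... vWA.
UYGHUR_PM = [
    "0.1163", "0.0379", "0.0662", "0.0741", "0.0347", "0.0542", "0.0484",
    "0.0307", "0.1044", "0.1050", "0.0352", "0.0704", "0.0536", "0.0346",
    "0.0520", "0.0137", "0.0818", "0.1787", "0.0695",
]


def combined_matching_probability(pm_values):
    """Product of per-locus matching probabilities (assumes independent loci)."""
    result = Decimal(1)
    for pm in pm_values:
        result *= Decimal(pm)
    return result


cmp_uyghur = combined_matching_probability(UYGHUR_PM)
cpd_uyghur = 1 - cmp_uyghur  # combined power of discrimination

print(f"CMP = {cmp_uyghur:.3E}")
print(f"CPD = {cpd_uyghur:.27f}")
```

With the rounded four-decimal inputs from Table 1, the product comes out on the order of 2 × 10⁻²⁴, in line with the matching probability reported for UGR.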


3.2 Hardy–Weinberg equilibrium

All loci were in Hardy–Weinberg equilibrium (HWE) in the HUI population (p > 0.05), while two loci in UGR (vWA and Penta E) and one locus each in HAN (D13S317) and KZK (D6S1043) deviated from HWE. However, after we applied sequential Bonferroni correction (Benjamini and Hochberg, 1995) to mitigate the multiple comparison problem (at a significance threshold of 0.05, about 5% of tests are expected to appear significant by chance), none of the loci in any of the four populations deviated from HWE (Supplementary Table S3). In our previous study (Zhan et al., 2018), none of the loci were out of HWE in the KGZ population, while one locus in MCH (D7S820), two in MGL (CSF1PO and D19S433), and four in UZK (D18S51, D2S1338, D7S820, and FGA) were out of HWE. After sequential Bonferroni correction (Benjamini and Hochberg, 1995), however, all loci in those four populations conformed to HWE.
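The effect of the correction described above can be illustrated with a short sketch. The text does not state which procedure was implemented; this example assumes Holm's step-down method, which is what "sequential Bonferroni" usually denotes (the cited Benjamini and Hochberg reference describes a different, false-discovery-rate procedure). The function name and the p-values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # step-down: stop at the first non-significant test
    return rejected


# Hypothetical p-values for 19 loci: two fall below 0.05 before correction.
p_hwe = [0.012, 0.031] + [0.20] * 17
after_correction = holm_bonferroni(p_hwe)
```

In this hypothetical case the smallest p-value (0.012) exceeds 0.05/19 ≈ 0.0026, so no locus remains significant after correction, mirroring the pattern reported for the UGR, HAN, and KZK samples.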

3.3 Linkage equilibrium

Linkage disequilibrium (LD) can result from the association between adjacent alleles co-inherited from a single ancestral chromosome. Particularly for tightly linked genes, if selection favors individuals with particular combinations of alleles, it produces LD that can persist for some time. LD between two loci decays gradually as a function of the recombination rate and of time measured in generations. When mutations are under positive selection, the LD surrounding the mutations is maintained because of the hitchhiking effect; thus, longer haplotypes at high frequencies can be maintained within the population. Many LD-based methods have been developed to detect positive selection. Hudson et al. (1994) proposed the first method to detect positive selection by measuring haplotype patterns. Using the extended haplotype homozygosity (EHH) test, Sabeti et al. (2006) developed a more robust method to detect positive selection by measuring longer haplotypes at high frequencies. This method was further refined by Voight et al. (2006), who standardized the EHH test using the genome-wide empirical distributions of EHH. Based on a similar rationale, Wang et al. (2007) developed a new version of the LD-based method called LD decay. LD can also be caused by the rate of mutation or recombination, random genetic drift, natural selection, founder effects, nonrandom mating, recent admixture, sampling effects, and population substructure (Chakravarti, 1999). Exact tests for linkage equilibrium (LE) showed that the p-values of 116 pairwise combinations of STR loci (UGR 32, KZK 31, HUI 32, and HAN 21) were lower than 0.05, indicating LD (Supplementary Table S4). After we applied sequential Bonferroni correction (Benjamini and Hochberg, 1995), only 17 pairs remained out of LE. These pairs were D6S1043/D2S1338, D2S1338/D5S818, D18S51/D21S11, D12S391/TPOX, TPOX/Penta D, and D6S1043/D5S818 in the UGR population; D7S820/FGA, D18S51/D8S1179, D7S820/Penta E, CSF1PO/D3S1358, and D6S1043/D8S1179 in the KZK population; D2S1338/D5S818, TH01/D18S51, D12S391/FGA, and D13S317/TH01 in the HUI population; and CSF1PO/D3S1358 and FGA/CSF1PO in the HAN population. We performed 171 pairwise LE tests in each population, and at most six pairs were out of LE in any one population. The product rule is used to estimate the random chance of observing a given STR profile within a population. This is done by multiplying the frequencies of the genotypes (combinations of alleles) found at all loci in the STR profile (Butler and Butler, 2010). The product rule for calculating the random match probability (RMP) across multiple STRs is best supported in the HAN population (only two pairs were out of LE), and in the other three populations (UGR, KZK, and HUI) it is also unlikely to produce significant errors. The HAN population is not endogamous, whereas the other three populations are; exogamy is uncommon in Xinjiang Muslim populations. The same results were observed in our previous study for the Kyrgyz and Uzbek populations.
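The product rule described above can be sketched in a few lines. The example assumes Hardy–Weinberg genotype frequencies (p² for a homozygote, 2pq for a heterozygote) and independence between loci; the allele frequencies are hypothetical and the function names are illustrative only. It also confirms the number of locus pairs tested per population.

```python
from math import comb, prod


def genotype_frequency(p, q=None):
    """Expected genotype frequency under HWE: p^2 (homozygote) or 2pq."""
    return p * p if q is None else 2.0 * p * q


def random_match_probability(profile):
    """Product rule: multiply genotype frequencies across independent loci."""
    return prod(genotype_frequency(*alleles) for alleles in profile)


# Hypothetical three-locus profile: heterozygote, homozygote, heterozygote.
profile = [(0.20, 0.10), (0.30,), (0.05, 0.25)]
rmp = random_match_probability(profile)

# Number of distinct locus pairs among 19 STRs, i.e. LE tests per population.
n_pairs = comb(19, 2)
```

Because the rule multiplies locus-level frequencies, any residual LD between a pair of loci makes the product a less accurate estimate of the joint frequency, which is why the small number of pairs out of LE matters for casework calculations.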

3.4 Ancestry content analysis with STRUCTURE

STRUCTURE analysis (which uses a model-based clustering algorithm) was performed on the four populations from this study (Xinjiang Han, Uyghur, Kazakh, and Xinjiang Hui) together with publicly available data for seven populations (Mongol, Kyrgyz, Uzbek, Manchu, Liaoning Han, Tibetan, and Jilin Korean) to explore the ancestry content and genetic landscape of Xinjiang populations. The number of inferred clusters (K) varied from 2 to 10, with 10 repetitions of each K value and 10,000 burn-in steps and 10,000 Markov chain Monte Carlo (MCMC) iterations for each repetition. We observed that the optimal number of ancestral populations was five (K = 5). Uyghur and Kazakh populations shared most of their genetic components with other Turkic-speaking populations such as Uzbek and Kyrgyz (yellow), while they shared a few genetic components with Tibetan (blue) and with Han, Hui, Manchu, and Korean populations (pink) (Figure 1). The pink component was common to all populations studied. Moreover, two genetic clusters were observed. One cluster mainly contained populations that are descendants of ancient Altai-speaking populations, and the second cluster contained mainly East Asian populations. Uyghur and Kazakh populations appeared to be genetically closer to Altai-speaking populations, while Han and Hui populations were closer to East Asian groups. Results for K = 2–10 are shown in Supplementary Figure S1. In the PCA on raw genotype data of the 11 populations, the first component (PC1) explained 1.39% of the variation and the second (PC2) explained 0.95%. We observed two main clusters on these two components: the first cluster contained the Kyrgyz, Kazakh, and Mongol populations, while the second contained the Uzbek, Uyghur, Manchu, Hui, Tibetan, Korean, and Han populations. PC1 and PC3 (0.87%) together explained 2.26% of the variation, and here we observed three clusters. The first cluster contained the Kyrgyz, Kazakh, and Mongol populations. The second cluster contained the Uzbek, Uyghur, and Tibetan populations. In the third cluster, typical East Asian populations (Han, Hui, Manchu, and Korean) were grouped together.
FIGURE 1

Genetic structure and population relationship between Xinjiang Han, Uyghur, Kazakh, Hui, Manchu, Kyrgyz, Uzbek, Mongol, and other Chinese populations.

PC2 (0.95%) and PC3 (0.87%) resulted in three clusters. The first cluster contained only the Tibetan population, while the second and third clusters contained all Altai-speaking populations and the typical East Asian populations, respectively. PC1 and PC4 (0.84%) also gave interesting results, forming two clusters. The first grouped the Tibetan population with the typical East Asian populations, while the second contained all Altai-speaking populations. The PCA results (Figure 2) agreed with the STRUCTURE analysis. They were also consistent with linguistic affinity and with whole-genome sequencing and high-density genotyping data (Xu et al., 2008; Xu and Jin, 2008; Bai et al., 2018).
FIGURE 2

Principal component analysis (PCA) revealed the genetic relationship on the basis of the first four components between Xinjiang Han, Uyghur, Kazakh, Hui, Manchu, Kyrgyz, Uzbek, Mongol, and other Chinese populations.
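The percentages attached to each principal component in the text are shares of total variance: the eigenvalue of that component divided by the sum of all eigenvalues of the covariance matrix. The following is a minimal pure-Python sketch of that calculation for the first component only, using power iteration. It is not the MVSP procedure used in the study; the function names and the allele-frequency matrix are hypothetical.

```python
def center_columns(rows):
    """Subtract each column mean so every variable has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance_matrix(rows):
    """Sample covariance matrix with variables in columns."""
    centered = center_columns(rows)
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def first_component_share(rows, iterations=500):
    """Fraction of total variance carried by PC1, via power iteration."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))  # trace = total variance
    # Non-uniform start vector: a constant vector lies in the null space
    # when every row of frequencies sums to 1.
    v = [float(i + 1) for i in range(len(cov))]
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return eigenvalue / total


# Hypothetical allele frequencies: 4 populations (rows) x 3 alleles (columns).
freqs = [
    [0.20, 0.30, 0.50],
    [0.25, 0.30, 0.45],
    [0.40, 0.35, 0.25],
    [0.45, 0.30, 0.25],
]
share_pc1 = first_component_share(freqs)
```

Small shares such as those reported for the raw-genotype PCA are typical when variance is spread over many alleles and individuals, whereas PCA on population-level allele frequencies concentrates variance in far fewer components.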


3.5 Xinjiang and worldwide population comparison

We compared the data of the four populations with previously published populations from Xinjiang in northwest China and from other regions worldwide using AMOVA, employing the available data for 15 STR loci. Genetic distances between the HAN population and 12 other populations from Xinjiang (Xinjiang Uyghur, Xinjiang Hui, Xinjiang Kazakh, Xinjiang Kyrgyz, Xinjiang Manchu, Xinjiang Uzbek, Xinjiang Mongols, Uygur-Xinjiang-1, Kazakh-Xinjiang-1, Kumul-Uyghur-Xinjiang-3, Uyghur-Xinjiang-2, Kazakh-Xinjiang-2) based on Nei’s standard formula are listed in Supplementary Table S5A. These distance values were used to build a neighbor-joining (N-J) tree between Xinjiang Han and the 12 other populations (Figure 3A). The Hui population from Xinjiang (0.0046) showed the smallest genetic distance from HAN, followed by the Mongol population from Xinjiang (0.0244), while the Kazakh-1 population (1.1693) showed the greatest genetic distance, followed by the Kyrgyz population from Xinjiang (0.4547). The first three components of a PCA based on allelic frequencies of the 15 STRs (explaining 91.13% of the genetic variation) showed that Altai-speaking populations were closely linked (Figure 3B). A heat map of the genetic distances (Figure 3C) showed two clusters: Manchu, Kyrgyz, Uzbek, and Kazakh-1 grouped together in the first, while the second contained the other nine populations. According to Zhang et al. (2016a), Uyghurs are genetically closer to Central Asian populations and to Mongolian populations from East Asia. Jin et al. (2017) found that these Turkic-speaking groups fall between European and East Asian populations. Feng et al. (2017) used a genome-wide human SNP array and found that the Uyghur population from XUAR has four major ancestral components, which were the result of two earlier admixed groups: one from the West, with European (25%–37%) and West South Asian (12%–20%) ancestries, and one from the East, with Siberian (15%–17%) and East Asian (29%–47%) ancestries. Their MultiWaver results indicated a two-wave admixture: an earlier wave ∼3,750 years ago (ya) and a more recent wave ∼750 ya. According to Seidualy et al. (2020), the Kazakh population has a mixed ancestry comprising East Asian (32.8%), European (30.8%), North Asian (28.9%), and South Asian (6%) components. Wen et al. (2021) found that the Northwest Chinese Kyrgyz showed a high percentage of Y haplogroup R1a1a1b2a2a-Z2125, which is related to Bronze Age Siberians, while the second dominant haplogroup was C2b1a3a1-F3796, related to Medieval Niru’un Mongols, such as the Uissun tribe from Kazakhs. Wen et al. (2020) also found that the Kazakh population from China showed the highest frequency (80%) of haplogroup C2b1a3a1-F3796 (previously C3*-Star Cluster), which is predominantly found in populations of Mongolian descent. Wang et al. (2019) found that about 70% of the paternal ancestry of the Hui population could be traced back to East Asia and the remaining 30% to various regions in West Eurasia. Zhao et al. (2020) reported that Mongolian and Kazakh groups derived 6%–40% of their ancestry from West Eurasia and 42%–64% from East Asia. He et al. (2018b) reported that the Uyghur population has 36.30% European-related ancestry, while the Hui have only 3.66%. Liu et al. (2018) reported that the Tibetan and Hui populations have a genetic affinity with East Asian populations, while the Uyghur population showed a genetic makeup similar to that of South Asian populations.
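For readers unfamiliar with the distance measure used throughout this section, the sketch below shows Nei's standard genetic distance, D = −ln(J_XY / √(J_X · J_Y)), where J_X and J_Y are the average homozygosities of the two populations across loci and J_XY is the average probability that alleles drawn from each population are identical. The study computed distances with Phylip; this is only an illustration of the formula, with a hypothetical function name and hypothetical allele frequencies.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard distance from {locus: {allele: frequency}} mappings."""
    loci = sorted(set(pop_x) & set(pop_y))
    if not loci:
        raise ValueError("populations share no loci")
    jx = jy = jxy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    # Average the identity terms over the shared loci.
    jx, jy, jxy = jx / len(loci), jy / len(loci), jxy / len(loci)
    if jxy == 0.0:
        return math.inf  # no alleles in common at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical allele frequencies at two loci (not data from this study).
pop_a = {"TH01": {"6": 0.2, "7": 0.3, "9": 0.5}, "TPOX": {"8": 0.5, "11": 0.5}}
pop_b = {"TH01": {"6": 0.1, "7": 0.2, "9": 0.7}, "TPOX": {"8": 0.6, "11": 0.4}}

d_ab = nei_standard_distance(pop_a, pop_b)
```

The distance is zero for populations with identical allele frequencies and grows as their frequency profiles diverge, which is why values such as 0.0046 (Hui versus Han) indicate near-identical profiles across the 15 loci.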
FIGURE 3

(A) A Neighbor-joining tree explaining the phylogenetic relationship between Xinjiang populations. (B) A heat map of pairwise Nei’s genetic distance values between Xinjiang populations. (C) Principal component analysis (PCA) based on Nei’s genetic distance revealed by the first two components between Xinjiang populations.

Genetic distances between the HAN population and 104 worldwide populations were calculated using Nei’s standard formula and are summarized in Supplementary Table S4D. Among worldwide populations, Asians living in Australia (0.0123) showed the closest association with the HAN population, followed by the Chamorro population (0.0574); at the other extreme, the Haitian population (0.2322) was the most distant, followed by the amaXhosa population (0.2251) from South Africa. These Nei’s genetic distance values were used to build a neighbor-joining tree (N-J tree) of Xinjiang Han and the 104 worldwide populations (Figure 4). The first 10 components of the PCA (PC1 = 25.06%, PC2 = 14.70%, PC3 = 9.80%, PC4 = 6.55%, PC5 = 6.02%, PC6 = 5.40%, PC7 = 4.19%, PC8 = 3.07%, PC9 = 2.39%, and PC10 = 2.08%) extracted 79.31% of the genetic variation (Figure 5). A heat map of the genetic distance matrix was also generated for the 105 worldwide populations (Figure 6). Populations from Xinjiang showed affinity with Central Asian, South West Asian, West Asian, and East European populations. An interactivity test among these 105 worldwide populations gave results consistent with the PCA and Nei’s genetic distance results described above (Figure 7). Comparisons restricted to Chinese ethnic groups (Supplementary Figures 2A–C) and to Asian ethnic groups (Supplementary Figures 3A–C) are discussed in Supplementary Text S1.
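To make the distance measure concrete, the following is a minimal Python sketch of Nei’s standard genetic distance, D = −ln(J_XY / √(J_X · J_Y)), computed from allele frequencies at shared loci. It is an illustration only and not the software used in this study; the function name `nei_standard_distance`, the locus names, and the toy allele frequencies are assumptions chosen for the example.

```python
import math

def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: dict mapping locus name -> {allele: frequency}.
    Only loci present in both populations are used.
    """
    loci = pop_x.keys() & pop_y.keys()
    if not loci:
        raise ValueError("populations share no loci")

    jx = jy = jxy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        alleles = fx.keys() | fy.keys()
        # Sum of squared frequencies within each population (homozygosity)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        # Sum of cross products between the two populations
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)

    n = len(loci)
    identity = (jxy / n) / math.sqrt((jx / n) * (jy / n))
    if identity == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(identity)

# Hypothetical allele frequencies, for illustration only
pop_a = {"D3S1358": {"15": 0.5, "16": 0.5}, "TH01": {"7": 0.4, "9": 0.6}}
pop_b = {"D3S1358": {"15": 0.3, "16": 0.7}, "TH01": {"7": 0.5, "9": 0.5}}

d_ab = nei_standard_distance(pop_a, pop_b)
```

A matrix of such pairwise values is the kind of input from which a neighbor-joining tree or a distance heat map can be built.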
FIGURE 4

A neighbor-joining tree showing the phylogenetic relationship between the Xinjiang Han population and 104 other reference populations worldwide.

FIGURE 5

A heat map of pairwise Nei’s genetic distance values between Xinjiang Han population and 104 other reference populations worldwide.

FIGURE 6

Principal component analysis (PCA) based on Nei’s genetic distance, showing the first two components, for the Xinjiang Han population and 104 other reference populations worldwide.

FIGURE 7

Interactivity test between the Xinjiang Han population and 104 other reference populations worldwide.

4 Conclusion

In the current study, we genotyped 19 autosomal STR loci in the Han, Uyghur, Kazakh, and Hui ethnic groups of Xinjiang and calculated the forensic parameters. The GoldenEye™ 20A panel appeared suitable for forensic investigations such as personal identification and paternity testing and had a high power of discrimination. The STR loci included in the kit showed no significant departures from HWE and only minimal departure from LE for a very small number of pairwise combinations of loci. Genetic characterization showed that the Uyghur and Kazakh ethnic groups were closely related to other Turkic-speaking groups, while the Han and Hui populations were associated with other Sinitic language-speaking populations. Interestingly, the Kazakh population showed an affinity with the Mongols, which suggested an ancient divergence between Kazakhs and Mongols when the Mongols originally migrated to present-day Xinjiang (Adnan et al., 2018a; Zhan et al., 2018; He et al., 2019).
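As a rough illustration of how combined forensic parameters of this kind are typically obtained from per-locus values, the sketch below multiplies per-locus match probabilities and combines per-locus powers as 1 − ∏(1 − xᵢ). This is a generic textbook calculation under the assumption that loci are independent, which is why the HWE and LE checks matter; it is not the authors' software, and the function names and per-locus values are hypothetical. `Decimal` is used because values such as 1 − 10⁻²⁴ cannot be resolved in ordinary floating point.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # enough digits to resolve values very close to 1

def combined_match_probability(pm_values):
    """Product of per-locus match probabilities (assumes independent loci)."""
    result = Decimal(1)
    for pm in pm_values:
        result *= Decimal(str(pm))
    return result

def combined_power(per_locus_values):
    """Combined power, 1 - prod(1 - x_i), for per-locus PD or PE values."""
    complement = Decimal(1)
    for x in per_locus_values:
        complement *= Decimal(1) - Decimal(str(x))
    return Decimal(1) - complement

# Hypothetical per-locus values for three loci, for illustration only
pd_per_locus = [0.9, 0.95, 0.8]   # power of discrimination
pm_per_locus = [0.1, 0.05, 0.2]   # match probability (here 1 - PD)

cpd = combined_power(pd_per_locus)
cmp_value = combined_match_probability(pm_per_locus)
```

With 19 loci, each contributing a small match probability, the product shrinks quickly, which is how combined values on the order reported for this panel arise.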
  87 in total

1.  Genetic variability of 15 autosomal STR loci in Russian populations.

Authors:  Vadim A Stepanov; Alexander V Melnikov; Andrey Yu Lash-Zavada; Vladimir N Kharkov; Svetlana A Borinskaya; Tatiana V Tyazhelova; Olga V Zhukova; Yuri V Schneider; Irina N Shil'nikova; Valery P Puzyrev; Anna A Rybakova; Nikolai K Yankovsky
Journal:  Leg Med (Tokyo)       Date:  2010-07-13       Impact factor: 1.376

2.  Genetic polymorphism analysis of 15 STR loci in Chinese Hui ethnic group residing in Qinghai province of China.

Authors:  Ya-jun Deng; Bo-feng Zhu; Chun-mei Shen; Hong-dan Wang; Jing-feng Huang; Yuan-zhe Li; Hai-xia Qin; Hao-fang Mu; Jie Su; Jie Wu; Bo Zhang; Shuan-liang Fan
Journal:  Mol Biol Rep       Date:  2010-11-13       Impact factor: 2.316

3.  Population data for 15 autosomal STR loci in the Dong ethnic minority from Guizhou Province, Southwest China.

Authors:  Lu Zhang
Journal:  Forensic Sci Int Genet       Date:  2015-02-16       Impact factor: 4.882

4.  Population genetic analyses and evaluation of 22 autosomal STRs in Indian populations.

Authors:  Mugdha Singh; Madhusudan R Nandineni
Journal:  Int J Legal Med       Date:  2017-01-06       Impact factor: 2.686

5.  Assessment of application value of 19 autosomal short tandem repeat loci of GoldenEye 20A kit in forensic paternity testing.

Authors:  Yan-Mei Huang; Jie Wang; Zhangping Jiao; Liu Yang; Xinning Zhang; Hui Tang; Yacheng Liu
Journal:  Int J Legal Med       Date:  2013-03-13       Impact factor: 2.686

6.  Population genetic study for 24 STR loci and Y indel (GlobalFiler™ PCR Amplification kit and PowerPlex® Fusion system) in 1000 Korean individuals.

Authors:  Hyun-Chul Park; Kicheol Kim; Younhyoung Nam; Jihye Park; Jinmyung Lee; Hyehyeon Lee; Hansol Kwon; Hanjun Jin; Wook Kim; Won Kim; Sikeun Lim
Journal:  Leg Med (Tokyo)       Date:  2016-06-21       Impact factor: 1.376

Review 7.  Positive natural selection in the human lineage.

Authors:  P C Sabeti; S F Schaffner; B Fry; J Lohmueller; P Varilly; O Shamovsky; A Palma; T S Mikkelsen; D Altshuler; E S Lander
Journal:  Science       Date:  2006-06-16       Impact factor: 47.728

8.  Genetic characterization of Y-chromosomal STRs in Hazara ethnic group of Pakistan and confirmation of DYS448 null allele.

Authors:  Atif Adnan; Allah Rakha; Kadirya Kasim; Anam Noor; Shahid Nazir; Sibte Hadi; Hao Pang
Journal:  Int J Legal Med       Date:  2018-10-30       Impact factor: 2.686

9.  Developmental Validation of the Huaxia Platinum System and application in 3 main ethnic groups of China.

Authors:  Zheng Wang; Di Zhou; Zhenjun Jia; Luyao Li; Wei Wu; Chengtao Li; Yiping Hou
Journal:  Sci Rep       Date:  2016-08-08       Impact factor: 4.379

10.  Phylogenic analysis and forensic genetic characterization of Chinese Uyghur group via autosomal multi STR markers.

Authors:  Xiaoye Jin; Yuanyuan Wei; Jiangang Chen; Tingting Kong; Yuling Mu; Yuxin Guo; Qian Dong; Tong Xie; Haotian Meng; Meng Zhang; Jianfei Li; Xiaopeng Li; Bofeng Zhu
Journal:  Oncotarget       Date:  2017-05-18