Comprehensive genetic structure analysis of Han population from Dalian City revealed by 20 Y-STRs.

Atif Adnan1,2, Kaidirina Kasimu3, Allah Rakha4, Guanglin He5, Tongya Yang2, Chuan-Chao Wang5, Jie Lu2, Jin-Feng Xuan1.   

Abstract

BACKGROUND: Dalian is a city formed in the 1880s in Liaoning province, Northeastern China, with a current population of 6.69 million. Han is the largest ethnic group not only across Mainland China (92%) and Taiwan (97%) but is also considered the largest ethnic group in the world, contributing more than 18% of the world's population.
METHODS: In the current study, we genotyped the Goldeneye® 20Y System loci in 879 unrelated male individuals from the Han ethnic group in Dalian city and calculated the forensic parameters of the 20 Y-STR loci.
RESULTS: In total, we observed 855 haplotypes, among which 835 (94.99%) were unique. The discrimination capacity (DC) of the full Goldeneye® 20Y System was 97.27%, falling only slightly to 96.93% when only the Y-filer® set of 17 Y-STRs was used, which suggests that the extended marker set adds little discriminatory power in this population. DYS388 showed the lowest gene diversity (0.5151) and DYS389II the highest (0.7621) among the single-copy Y-STRs, while DYS385 showed the highest gene diversity overall (0.9683).
CONCLUSION: Multidimensional scaling (MDS) analysis based upon pairwise Rst genetic distances showed differences among Han populations from the east to the west and from the north to the south. We also predicted haplogroups from the Y-STR haplotypes, which showed the dominance of Haplogroup O (65.2%) followed by Haplogroup C (14.5%) in the Dalian Han population. Moreover, 10 individuals in our samples showed a null allele at DYS448. We also performed linear discriminant analysis (LDA) between Han and other prominent Chinese minority ethnic groups. We have deposited the Y-STR data in the Y-Chromosome Haplotype Reference Database (YHRD) for future forensic and other uses.
© 2020 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.

Keywords:  Dalian; Han; O haplogroup; Y-Chromosome Haplotype Reference Database (YHRD); Y-STRs; forensic genetics

Year:  2020        PMID: 31989793      PMCID: PMC7057124          DOI: 10.1002/mgg3.1149

Source DB:  PubMed          Journal:  Mol Genet Genomic Med        ISSN: 2324-9269            Impact factor:   2.183


INTRODUCTION

Dalian city is renowned as the trading and financial center of Northeastern Asia, with the Shandong Peninsula lying southwest across the Bohai Strait and Korea lying across the Yellow Sea to the east. Han Chinese account for 84% of the Dalian population, followed by Manchu (13%), Mongol (1.6%), and Hui (0.7%). Han is the world's largest ethnic group, making up about 18% of the global population (2011). The name “Han” comes from the Han Dynasty, which is considered the golden era of Chinese civilization. During the Han Dynasty, China extended its power and influence to other parts of Asia. “Han” seems more like a nationality than an ethnic group, since many local tribes merged under the name Han during that period (Gerent, 1996). Y‐chromosomal STRs are widely used in sexual assault cases, paternity testing, reconstruction of human population history (Adnan et al., 2018), and missing person investigations (Adnan, Ralf, Rakha, Kousouri, & Kayser, 2016) because of their male‐specific inheritance (Rakha et al., 2018). Previous studies have confirmed that the resolution of male lineage differentiation can be improved by carefully adding Y‐STRs to currently available Y‐STR sets such as Y‐filer (Hanson & Ballantyne, 2004, 2007; Hedman, Neuvonen, Sajantila, & Palo, 2011; Rodig et al., 2008). The recently developed Goldeneye® 20Y amplification kit (Goldeneye technology Ltd.) co‐amplifies 20 Y‐STR loci (DYS19, DYS348, DYS385a/b, DYS388, DYS389I, DYS389II, DYS390, DYS392, DYS393, DYS437, DYS439, DYS447, DYS448, DYS458, DYS460, DYS635, Y‐GATA‐H4, DYS391, and DYS456) with five dyes (Gao et al., 2017). To better understand the paternal genetic structure and to characterize the forensic resolution of these 20 Y‐STRs in Han Chinese in Northeastern Asia, we used the Goldeneye® 20Y amplification kit to genotype 879 Han males residing in Dalian city.
We explored the relationship between the Dalian Han and other Han populations (Fan, Zhang, et al., 2018; Kim, Han, Kim, & Kim, 2010; Li, Yu, Li, Jin, & Yan, 2016; Nothnagel et al., 2017; Sun et al., 2019; Wang et al., 2017, 2016; Xu et al., 2016; Yanmei et al., 2010; Zhang et al., 2014; Zhou, Ren, Zhang, Wang, & Huang, 2016; Zhou et al., 2018) from the east to the west and from the north to the south of China. We also compared the Han of Dalian city with other regional ethnic groups (Adnan, Kasim, et al., 2019; Bai, Liu, Lv, Shi, & Ma, 2016; Bian et al., 2016; Chen et al., 2018; Du et al., 2019; Fan et al., 2019; Fan, Wang, Chen, Long, et al., 2018; Fan, Wang, Chen, Zhang, et al., 2018; Gayden, Bukhari, Chennakrishnaiah, Stojkovic, & Herrera, 2012; Guo, 2015; Guo, Song, & Zhang, 2016; Guo, Zhang, & Jiang, 2015; He & Guo, 2013; Ji et al., 2017; Ou et al., 2015; Shan et al., 2014; Shi, Bai, Bai, & Yu, 2011; Song et al., 2019; Wang et al., 2018; Yao et al., 2016; Zhu et al., 2006) from East Asia and the world.

MATERIALS AND METHODS

Sample collection and ethical approval

The ethical review board of the China Medical University, Shenyang, Liaoning Province, People's Republic of China approved this study in accordance with the standards of the Declaration of Helsinki. Blood stain samples were collected from 879 unrelated individuals whose families had resided in Dalian city for at least three generations and who were confirmed as Han Chinese (Figure 1). The aims and procedures of the study were explained to all volunteers before they signed the informed consent forms.
Figure 1

Geographic position of Dalian city, from where samples were collected for this study

DNA extraction, PCR amplification, and fragment length analysis

DNA was extracted using the phenol‐chloroform procedure. DNA was quantified using the Quantifiler™ Human DNA Quantification Kit (Applied Biosystems) according to the manufacturer's instructions. The 20 Y‐STR loci of the Goldeneye™ 20Y amplification kit (Goldeneye technology Ltd.) were co‐amplified in a GeneAmp® PCR 9700 (Life Technologies) thermal cycler according to the manufacturer's recommendations. Allele separation and detection were performed on an ABI 3500 genetic analyzer (Applied Biosystems) with reference to the ORG 500 internal size standard (Goldeneye™) and the Goldeneye™ 20Y Allelic Ladder, in accordance with the kit recommendations. Allele calling was performed with GeneMapper 3.2 software.

Confirmation of DYS448 deletions and sequencing

The PowerPlex® Y23 System Amplification Kit (Promega) and the Microreader™ 29Y Direct ID System were used according to the manufacturers' protocols to confirm the null alleles at DYS448 in 10 samples. These samples were subsequently sequenced as described elsewhere (Adnan, Rakha, et al., 2019).

Statistical analysis

Haplotype and allelic frequencies were calculated by the direct counting method, and haplotype diversity (HD) was calculated according to:

\[ HD = \frac{n}{n-1}\left(1 - \sum_{i} p_i^{2}\right) \]

where n is the male population size and p_i is the frequency of the ith haplotype. The discrimination capacity (DC) was calculated as the number of different haplotypes divided by the total number of samples. Rst and Fst pairwise genetic distances and associated probability values (p values, 10,000 permutations) were calculated using the analysis of molecular variance (AMOVA) tool on the YHRD website. A reduced‐dimensionality spatial representation of the populations based on Rst values was produced using multidimensional scaling (MDS) with IBM SPSS Statistics for Windows, Version 23.0 (IBM Corp.). Fst values were used to construct a neighbor‐joining tree in Mega 7.0 (Kumar, Stecher, & Tamura, 2016). Linear discriminant analysis (LDA) was performed in R (R Core Team, 2015). We also predicted the Y‐SNP haplogroups of the samples from their Y‐STR haplotypes using the Y‐DNA Haplogroup Predictor NEVGEN (http://www.nevgen.org).
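The forensic parameters defined above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name `forensic_parameters` is hypothetical, RMP is assumed here to be the sum of squared haplotype frequencies (the text does not state its formula), and the sample is synthetic: it only reproduces the reported totals (879 males, 855 haplotypes, 835 unique), while the split of the 20 shared haplotypes into 16 pairs and 4 triples is an arbitrary assumption.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Compute haplotype-level forensic parameters by direct counting."""
    n = len(haplotypes)
    counts = Counter(haplotypes)              # haplotype -> number of carriers
    freqs = [c / n for c in counts.values()]  # p_i for each distinct haplotype
    rmp = sum(p * p for p in freqs)           # assumed definition: sum of p_i^2
    unique = sum(1 for c in counts.values() if c == 1)
    return {
        "n": n,
        "haplotypes": len(counts),
        "unique": unique,
        "DC": len(counts) / n,                # distinct haplotypes / samples
        "PUH": unique / n,                    # unique haplotypes / samples
        "RMP": rmp,
        "HD": n / (n - 1) * (1 - rmp),        # HD = n/(n-1) * (1 - sum p_i^2)
    }


# Synthetic sample (placeholder labels, NOT real genotypes):
# 835 singletons + 16 haplotypes seen twice + 4 seen three times = 879 males.
sample = (
    [f"unique_{i}" for i in range(835)]
    + [f"pair_{i}" for i in range(16) for _ in range(2)]
    + [f"triple_{i}" for i in range(4) for _ in range(3)]
)

params = forensic_parameters(sample)
for name, value in params.items():
    print(name, value)
```

In practice each haplotype would be a tuple of allele calls across the loci rather than a placeholder string; any hashable representation works with `Counter`.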

RESULTS

The 20 Y‐STRs were successfully genotyped using the Goldeneye® 20Y System kit in 879 unrelated Han males from Dalian city in Northeastern China. The haplotypes generated for the 879 Han individuals are listed in Table S1. We used these haplotypes to predict haplogroups with the online haplogroup predictor NevGen, and the results are also listed in Table S1. The data for the 17 Y‐filer loci can also be accessed via YHRD under accession number YA004552.

Gene diversity and allele frequency of 20 Y‐STRs in Han individuals

Allelic frequencies of the 20 Y‐STRs, along with gene diversity (GD) values, in 879 Han male individuals from Dalian city are listed in Table S2. A total of 214 alleles were observed. DYS385a/b was the most polymorphic locus, with 78 different allele combinations. Allelic frequencies ranged from 0.0011 to 0.7008. GD ranged from 0.37971 (DYS391) to 0.81769 (DYS447) for single‐copy STRs, while the multi‐copy STR (DYS385a/b) showed the highest diversity (0.96836). Overall, the GD value for all 20 STRs was 0.6692. A low diversity value for DYS391 was also observed in previously studied populations (Ou et al., 2015; Wu et al., 2011).

Haplotype diversity

To check the haplotype diversity of the 20 Y‐STRs in 879 Han individuals from Dalian, we evaluated our data for the minimal nine loci, the extended 11 loci, the PowerPlex 12 loci, the Y‐filer 17 loci, and the Goldeneye 20Y loci, as shown in Table 1.
Table 1

Forensic statistical parameters in Dalian Han population at five different levels

Forensic parameters   9 Y‐STRs      11 Y‐STRs     12 Y‐STRs     17 Y‐STRs     20 Y‐STRs
Total                 879           879           879           879           879
Haplotypes            731           800           808           852           855
Unique haplotypes     627           740           754           829           835
DC                    0.831626849   0.910125142   0.919226394   0.9692833     0.972696246
PUH                   0.71331058    0.841865757   0.857792947   0.9431172     0.949943117
RMP                   0.0017        0.0014        0.0014        0.0012        0.0012
HD                    0.9994        0.9997        0.9997        0.9999        0.9999

Abbreviations: DC, discriminatory capacity; HD, haplotype diversity; PUH, proportion of unique haplotypes; RMP, random matching probability.

At the minimal haplotype level of 9 STRs, a total of 731 haplotypes were observed, with a DC of 83.16% and an HD of 0.9994. At the SWGDAM 11 STRs, a total of 800 haplotypes were observed and DC increased to 91.01% (about 1.09‐fold). At the PowerPlex Y 12 STRs, the number of unique haplotypes increased to 754, compared with 740 for the SWGDAM 11 STRs, while the other parameters changed little. With the addition of 5 STRs to PowerPlex Y 12, a total of 852 haplotypes were observed at the Y‐filer 17 STRs, of which 829 (94.31% of the samples) were unique, with a DC of 96.93% and an HD of 0.9999. Finally, when another three STRs were added to the Y‐filer set, a total of 855 haplotypes were observed, but most of the parameters remained nearly static compared with those of the Y‐filer STRs.
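The diminishing return from each larger marker set can be made explicit from the haplotype counts in Table 1. The short Python sketch below is illustrative only. The helper name `incremental_gains` is hypothetical, and the only inputs are the sample size and the haplotype counts reported in the table.

```python
TOTAL_SAMPLES = 879

# Number of distinct haplotypes observed at each marker set (Table 1).
HAPLOTYPE_COUNTS = [
    ("9 Y-STRs", 731),
    ("11 Y-STRs", 800),
    ("12 Y-STRs", 808),
    ("17 Y-STRs", 852),
    ("20 Y-STRs", 855),
]


def incremental_gains(counts, total):
    """For each step between marker sets, return
    (from_set, to_set, extra haplotypes, DC gain in percentage points)."""
    gains = []
    for (prev_name, prev_n), (name, n) in zip(counts, counts[1:]):
        extra = n - prev_n
        gains.append((prev_name, name, extra, 100.0 * extra / total))
    return gains


gains = incremental_gains(HAPLOTYPE_COUNTS, TOTAL_SAMPLES)
for prev_name, name, extra, dc_gain in gains:
    print(f"{prev_name} -> {name}: +{extra} haplotypes, +{dc_gain:.2f} points of DC")
```

Under these counts, the step from 17 to 20 loci resolves only three additional haplotypes, which is the basis for saying that most parameters barely change beyond the Y‐filer set.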

Genetic differences along the landscape of China among Han and other minorities from China

To analyze the relationship between other regional Han ethnic groups and the Han individuals from Dalian in Northeastern China studied here, we collected data from 10 Han groups and 34 minorities from the east to the west and from the north to the south across China. We calculated Fst and Rst pairwise genetic distances, which are commonly used to estimate population differences and compute genetic relationships among populations. According to the Fst pairwise genetic distances (Table S3), Han from Anhui and Shanghai, both in Eastern China, have the closest genetic distance to our studied Han group, whereas Fujian Han in Southern China and Zigong in Southwestern China have the largest genetic distances. When we compared our dataset with the 34 minority groups of China, we found that Manchu (0.0001) and Xibe (0.0001) from Liaoning in Northeastern China have the closest genetic distance to our Han group, while the largest genetic distances were seen for the Bonan (0.0181) and Tuva (0.0250) ethnic groups from Northern China (Table S4). We then inferred the evolutionary relationships between the Han group and the other reference populations from neighbor‐joining (NJ) phylogenetic trees based on the Fst values (Figures 2 and 3). Genetically closely related groups clustered tightly in a clade, while genetically distant groups were widely separated. The 35 ethnic groups were divided into six clades according to the NJ tree. She, Han, Danmin, Lingao, Kejia, Yi, Yao, Korean, Bai, and Lisu have the shortest genetic distances and clustered together in one clade. The Uyghur and Kazakh groups lay at the terminal node of the branch outermost from the Han, which is in accordance with their geographic distances and with previous studies showing that they carry a large amount of West Eurasian‐related admixture.
Multidimensional scaling (MDS) analysis based on Rst values between Han ethnic groups from different geographic locations of China revealed that the Han population from Dalian is relatively isolated compared with other Han groups (Figure 4; Table S5). The MDS plot of Han and 34 other minorities of China showed that the Han population formed a close cluster with the Manchu, Liqian, She, and Xibe ethnic groups (Figure 5; Table S6). We also performed LDA in this study. The results indicated that the Han population was probably an admixture of other Chinese populations, with the exception of the Uyghur and Tibetan populations (Figure 6).
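To show how a pairwise distance matrix becomes a two‐dimensional plot, the sketch below implements classical (Torgerson) MDS in plain Python. This is a swapped‐in technique for illustration: the study used IBM SPSS, whose MDS procedure may differ, so the coordinates will not necessarily match the published figures. The function name `classical_mds`, the population labels, and the toy distance matrix are all hypothetical and are not Rst values from the study.

```python
import math
import random


def classical_mds(dist, dims=2, iters=500, seed=0):
    """Classical MDS: double-centre the squared distances, then take the
    top eigenvectors via power iteration with deflation."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]       # equals column means (symmetric)
    grand = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [rng.random() for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][k] = v[i] * scale
        for i in range(n):                    # deflate: remove this component
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy input: distances generated from four made-up points, so an exact
# two-dimensional embedding exists. Labels are placeholders.
labels = ["pop_A", "pop_B", "pop_C", "pop_D"]
points = [(0.0, 0.0), (0.03, 0.0), (0.0, 0.04), (0.03, 0.04)]
dist = [[euclid(p, q) for q in points] for p in points]

coords = classical_mds(dist)
for label, (x, y) in zip(labels, coords):
    print(f"{label}: ({x:.4f}, {y:.4f})")
```

The recovered coordinates are defined only up to rotation and reflection, so only the inter‐point distances are comparable with the input. Real Rst matrices are generally not exactly Euclidean, in which case the embedding is an approximation.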
Figure 2

Neighbor‐joining phylogenetic tree for 11 Han ethnic groups from the east to the west, from the north to the south based on a distance matrix of Fst

Figure 3

Neighbor‐joining phylogenetic tree for 35 Chinese ethnic groups based on a distance matrix of Fst

Figure 4

Two‐dimensional plot from multidimensional scaling analysis of Rst values based on Y‐filer haplotypes for 11 Han ethnic groups from east to west, north to south

Figure 5

Two‐dimensional plot from multidimensional scaling analysis of Rst values based on Y‐filer haplotypes for 35 Chinese ethnic groups

Figure 6

LDA Analysis between 10 major Chinese ethnic groups

Y‐chromosomal Haplogroup O is mostly found in Eastern Asia (Figure 7). In the current study, we observed 11 haplogroups in total, among which Haplogroup O was the most frequent (65%). Haplogroup O1 accounts for 48% of the studied population and O2 for 17%, followed by Haplogroup C2 at 14%. The frequency of Haplogroup C2 in Dalian Han is much higher than that found in other Han groups. This high level of Haplogroup C2 is plausible because Manchuria was the homeland of Mongol‐like horsemen turned merchants (Adnan, Kasim, et al., 2019). The results suggest that Han Chinese have genetic admixture with local indigenous populations.
Figure 7

Map showing distributions of predicted haplogroups in Han population from Dalian, China

Characterization of DYS448 deletions

DYS448 lies adjacent to the azoospermia factor c (AZFc) region, which plays an important role in spermatogenesis and forms an “ampliconic” repeat that acts as a substrate for non‐allelic homologous recombination (NAHR). The core repeat motif of the DYS448 locus is the hexanucleotide repeat AGAGAT (Redd et al., 2002). DYS448 has two polymorphic domains separated by an invariant 42‐bp region. After successful sequencing of the PCR products, we submitted the data to GenBank for accession numbers. We aligned our sequences with a reference sample from the current study that showed allele 20 at DYS448 and with a GenBank sequence (accession #MH200582). Of the ten samples, eight showed primer binding site problems both upstream and downstream, while two showed upstream mutations. The null allele phenomenon at DYS448 in East Asia might be due to the kits themselves (Goldeneye, Microreader, and PowerPlex Y23). We note that the DYS448 primers designed by the above‐mentioned companies are not publicly available, and the companies should validate them properly to yield good results in the future. Null alleles at DYS448 are more frequent in Asia (particularly East Asia) than in other regions. All 10 individuals with DYS448 null alleles (Table 2) belonged to Haplogroup C2. Haplogroup C2 is more frequent in regions once associated with the Mongol Empire than in other regions.
Table 2

Sequence in the relevant flanking and repeat region of the DYS448 locus 10 null alleles in Dalian Han population

CONCLUSION

In conclusion, for the first time, we have genotyped 879 Dalian Han individuals at 20 Y‐chromosomal STR loci using the Goldeneye® 20Y System kit. The genetic variation in the Dalian Han population and its comparison with other relevant groups were analyzed using different statistical tests. Studies based on uniparental markers have identified structural differences among the Han Chinese from Northern, Northwestern, Southern, and Eastern China (Bi et al., 2015), and the results of our study are in accordance with them. The Y‐STRs included in the Goldeneye® 20Y System kit showed strong power of discrimination in the Dalian Han population and could be useful for building regional or national reference data for forensic paternity testing, missing person investigations, and disaster victim identification. We detected null alleles at the DYS448 locus in 10 individuals, which accounts for 1.13% of all the alleles at this locus. This null allele phenomenon has also been reported in other populations (Adnan, Rakha, et al., 2019), but its frequency is higher in Asia than in other regions. Interestingly, all individuals with null alleles observed in the current study belong to haplogroup C2 (Huang et al., 2018) (previously known as haplogroup C3). We suggest that commercial companies pay special attention when designing primers for DYS448. The inclusion of these data in the YHRD makes them available for forensic and other purposes.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.

AUTHORS' CONTRIBUTIONS

J.L. and A.A. developed the idea. A.A. wrote the manuscript. A.A., K.K., T.Y., G.H., and J.F. conducted the experiment. A.A., K.K., G.H., A.R., C.W., J.L., and J.F. analyzed the results. A.A., C.W., and J.L. modified the manuscript. All authors reviewed the manuscript.

COMPLIANCE WITH ETHICAL STANDARDS

All participants gave their informed consent in writing after the study aims and procedures were carefully explained to them in their own language. The study was approved by the ethical review board of the China Medical University, Shenyang Liaoning Province, People's Republic of China, and was in accordance with the principles of the Declaration of Helsinki.
  48 in total

1.  Dissecting the Finnish male uniformity: the value of additional Y-STR loci.

Authors:  M Hedman; A M Neuvonen; A Sajantila; J U Palo
Journal:  Forensic Sci Int Genet       Date:  2010-04-28       Impact factor: 4.882

2.  Genetic polymorphisms for 11 Y-STRs haplotypes of Chinese Yi ethnic minority group.

Authors:  Bofeng Zhu; Chunmei Shen; Guangli Qian; Ruiyi Shi; Yonghui Dang; Jun Zhu; Ping Huang; Yongcheng Xu; Qianzi Zhao; Jun Ma; Yao Liu
Journal:  Forensic Sci Int       Date:  2005-07-18       Impact factor: 2.395

3.  Population data of 17 Y-STRs (Yfiler) from Punjabis and Kashmiris of Pakistan.

Authors:  Atif Adnan; Allah Rakha; Anum Noor; Mannis van Oven; Arwin Ralf; Manfred Kayser
Journal:  Int J Legal Med       Date:  2017-05-17       Impact factor: 2.686

4.  Forensic and phylogenetic analyses among three Yi populations in Southwest China with 27 Y chromosomal STR loci.

Authors:  Yu-Ran An; Guang-Yao Fan; Chang-Xiu Peng; Ji-Liang Deng; Li-Peng Pan; Yi Ye
Journal:  Int J Legal Med       Date:  2018-12-17       Impact factor: 2.686

5.  Forensic value of 14 novel STRs on the human Y chromosome.

Authors:  Alan J Redd; Al B Agellon; Veronica A Kearney; Veronica A Contreras; Tatiana Karafet; Hwayong Park; Peter de Knijff; John M Butler; Michael F Hammer
Journal:  Forensic Sci Int       Date:  2002-12-04       Impact factor: 2.395

6.  Discriminating power of rapidly mutating Y-STRs in deep rooted endogamous pedigrees from Sindhi population of Pakistan.

Authors:  Allah Rakha; Yu Na Oh; Hwan Young Lee; Safdar Hussain; Ali Muhammad Waryah; Atif Adnan; Kyoung-Jin Shin
Journal:  Leg Med (Tokyo)       Date:  2018-08-04       Impact factor: 1.376

7.  Population analysis of 27 Y-chromosomal STRs in the Li ethnic minority from Hainan province, southernmost China.

Authors:  Haoliang Fan; Xiao Wang; Haixiang Chen; Xiaojia Zhang; Peiyu Huang; Ren Long; Anwen Liang; Tao Song; Jianqiang Deng
Journal:  Forensic Sci Int Genet       Date:  2018-01-31       Impact factor: 4.882

8.  Haplotype structure of 27 Yfiler®Plus loci in Chinese Dongxiang ethnic group and its genetic relationships with other populations.

Authors:  Junfang Wang; Shaoqing Wen; Meisen Shi; Yaju Liu; Jing Zhang; Rufeng Bai; Hui Li
Journal:  Forensic Sci Int Genet       Date:  2018-01-01       Impact factor: 4.882

9.  Evaluation of haplotype discrimination capacity of 35 Y-chromosomal short tandem repeat loci.

Authors:  Heike Rodig; Lutz Roewer; Annett Gross; Tom Richter; Peter de Knijff; Manfred Kayser; Werner Brabetz
Journal:  Forensic Sci Int       Date:  2007-05-31       Impact factor: 2.395

10.  Analysis of genetic admixture in Uyghur using the 26 Y-STR loci system.

Authors:  Yingnan Bian; Suhua Zhang; Wei Zhou; Qi Zhao; Ruxin Zhu; Zheng Wang; Yuzhen Gao; Jie Hong; Daru Lu; Chengtao Li
Journal:  Sci Rep       Date:  2016-02-04       Impact factor: 4.379
