Haplotypic Associations and Differentiation of MHC Class II Polymorphic Alu Insertions at Five Loci With HLA-DRB1 Alleles in 12 Minority Ethnic Populations in China.

Yina Cun1, Lei Shi1, Jerzy K Kulski2, Shuyuan Liu1, Jia Yang3, Yufen Tao1, Xinwen Zhang3, Li Shi3, Yufeng Yao1.   

Abstract

The analysis of polymorphic variations in the human major histocompatibility complex (MHC) class II genomic region on the short-arm of chromosome 6 is a scientific enquiry to better understand the diversity in population structure and the effects of evolutionary processes such as recombination, mutation, genetic drift, demographic history, and natural selection. In order to investigate associations between the polymorphisms of HLA-DRB1 gene and recent Alu insertions (POALINs) in the HLA class II region, we genotyped HLA-DRB1 and five Alu loci (AluDPB2, AluDQA2, AluDQA1, AluDRB1, AluORF10), and determined their allele frequencies and haplotypic associations in 12 minority ethnic populations in China. There were 42 different HLA-DRB1 alleles for ethnic Chinese ranging from 12 alleles in the Jinuo to 28 in the Yugur with only DRB1∗08:03, DRB1∗09:01, DRB1∗12:02, DRB1∗14:01, DRB1∗15:01, and DRB1∗15:02 present in all ethnic groups. The POALINs varied in frequency between 0.279 and 0.514 for AluDPB2, 0 and 0.127 for AluDQA2, 0.777 and 0.995 for AluDQA1, 0.1 and 0.455 for AluDRB1 and 0.084 and 0.368 for AluORF10. By comparing the data of the five-loci POALIN in 13 Chinese ethnic populations (including Han-Yunnan published data) against Japanese and Caucasian published data, marked differences were observed between the populations at the allelic or haplotypic levels. Five POALIN loci were in significant linkage disequilibrium with HLA-DRB1 in different populations and AluDQA1 had the highest percentage association with most of the HLA-DRB1 alleles, whereas the nearby AluDRB1 indel was strongly haplotypic for only DRB1∗01, DRB1∗10, DRB1∗15 and DRB1∗16. There were 30 five-locus POALIN haplotypes inferred in all populations with H5 (no Alu insertions except for AluDQA1) and H21 (only AluDPB2 and AluDQA1 insertions) as the two predominant haplotypes. 
Neighbor joining trees and principal component analyses of the Alu and HLA-DRB1 polymorphisms showed that genetic diversity of these genomic markers is associated strongly with the population characteristics of language family, migration and sociality. This comparative study of HLA-DRB1 alleles and multilocus, lineage POALIN frequencies of Chinese ethnic populations confirmed that POALINs whether investigated alone or together with the HLA class II alleles are informative genetic and evolutionary markers for the identification of allele and haplotype lineages and genetic variations within the same and/or different populations.
Copyright © 2021 Cun, Shi, Kulski, Liu, Yang, Tao, Zhang, Shi and Yao.

Keywords:  Chinese ethnic populations; HLA class II regions; HLA-DRB1; POALIN; haplotypes; polymorphism

Year:  2021        PMID: 34305999      PMCID: PMC8292818          DOI: 10.3389/fgene.2021.636236

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


Introduction

The human major histocompatibility complex (MHC) class II genomic region on the short-arm of chromosome 6 contains highly polymorphic classical and non-classical human leukocyte antigen (HLA) class II genes (HLA-DRB1, -DRA, -DQA1, -DQB1, -DQA2, -DQB2, -DPA1, and -DPB1) involved in the regulation of the innate and adaptive immune system, autoimmunity, and transplantation (Shiina et al., 2004, 2009; Vandiedonck and Knight, 2009; Trowsdale, 2011). The extensive polymorphism of the HLA class II genes is studied widely and used to provide a better understanding of the diversity in population structure and the effects of evolutionary processes such as recombination, mutation, genetic drift, demographic history, and natural selection (Meyer et al., 2006; Traherne, 2008; Pierini and Lenz, 2018; Manczinger et al., 2019). For example, there are at least 2,909 HLA-DRB1 alleles distributed world-wide with the official sequences and designations provided by the IMGT/HLA database (Robinson et al., 2020). Consequently, the HLA-DRB1 alleles are genetic markers that are utilized often for the assessment of population structure and differentiation as well as providing information on interpopulation genetic exchange (gene flow) and other demographic events (Di and Sanchez-Mazas, 2011; Sanchez-Mazas et al., 2013, 2017; Sanchez-Mazas and Meyer, 2014; Gonzalez-Galarza et al., 2020). Moreover, the HLA-DRB1 alleles present intracellular or exogenous antigen peptides to CD4+ T cells that trigger and regulate the downstream immune responses to defend against pathogen invasion (Chaplin, 2010). Therefore, this highly polymorphic genomic marker might reveal changes associated with pathogen-mediated pressure on highly heterogenous and diverse populations (Sun et al., 2015; Weiskopf et al., 2016). In addition to polymorphic HLA class II genes, the MHC class II region has a number of polymorphic Alu insertions (POALINs) that are informative population ancestral lineage markers. 
They are insertion/deletions (either present or absent) at integration sites, which carry characteristic alleles or haplotypes inherited from different ancestral populations (Bennett et al., 2004; Kulski and Dunn, 2005; Ray et al., 2007). Alu retroelements (short interspersed nuclear elements) are among the class of genomic repetitive DNA elements that first appeared in primates about 65 million years ago and then amplified by retrotransposition to the present estimated one million copies per human genome (Lander et al., 2001; Batzer and Deininger, 2002). POALINs are useful lineage and evolutionary genetic markers for studying the origin and genomic diversity of human populations because (1) their allelic frequency distributions vary significantly among geographically different human populations (Deininger and Batzer, 1999; Jorde et al., 2000; Watkins et al., 2001), and (2) they have an inherited identity by descent arising from a known initial ancestral state (no Alu insertion), whereby their presence and/or absence define the ancestral lineages within a population (Antunez-de-Mayolo et al., 2002). Some MHC Alu family members were used previously as evolutionary molecular markers to infer the ancestral duplication history of HLA class I and class II gene copies (Mnukova-Fajdelova et al., 1994; Svensson et al., 1996; Kulski et al., 1999, 2000). Also, several studies reported on the frequencies and distribution of human-specific POALIN loci within the HLA class I region and on their inferred haplotypic associations with HLA-A, -B and -C loci in different populations (Dunn et al., 2002, 2003, 2005a,b, 2007; Yao et al., 2009, 2010; Kulski et al., 2011, 2019; Mastana et al., 2017; Singh et al., 2019). 
These associations reflect in part the different haplotypic structures of the MHC class I and class II regions and the linkage of multiple polymorphic loci, especially when extended over long stretches (1–3 Mb) of conserved genomic sequences in human populations known as ancestral haplotypes (Dawkins et al., 1999) or conserved extended haplotypes (Alper et al., 2006; Larsen et al., 2014). Although comparative DNA sequence analysis of the entire MHC genomic region between two homozygous HLA haplotypes has indicated the presence of POALINs within the MHC class II region (Stewart et al., 2004), the frequencies of the five human-specific POALINs (AluDPB2, AluDQA2, AluDQA1, AluDRB1, and AluORF10) in the MHC class II genomic region were determined previously only for Japanese and Australian Caucasian populations (Kulski et al., 2010) and for Chinese Han in Yunnan province (Shi et al., 2014). By comparing the data of the MHC class II five-loci POALINs in Chinese Han with Japanese and Caucasian data, marked differences were observed between the three ethnic groups at the allelic or haplotypic levels. In addition, each POALIN was in significant linkage disequilibrium (LD) and/or haplotypically associated (Kulski et al., 2020, 2021) with a variety of HLA-DRB1 alleles in Chinese Han in Yunnan province (Shi et al., 2014). These results showed that POALINs, whether investigated alone or together with the HLA class II alleles, are informative genetic markers for the identification of allele and haplotype lineages and variations within the same and/or different populations. Besides the Chinese-speaking Han majority, there are 55 officially recognized minority ethnic populations in China, which account for about 8% of the overall Chinese population and provide abundant genetic resources for POALIN–HLA inferred haplotype studies (Yao et al., 2010). 
The minority ethnic groups living in the south and southwest of China can be traced back to three major ancient groups, the Di-Qiang, Bai-Pu, and Bai-Yue, which speak languages of the Tibeto-Burman, Mon-Khmer and Daic subfamilies, respectively (Table 1), whereas in the northwest of China most ethnic groups speak languages of the Mongolian, Tujue and Manchu-Tungusic subfamilies of the Altaic language family (Guo, 2000). Although the anthropological, cultural and linguistic characteristics of some of these ethnic populations have been studied in detail (You, 1994; Guo, 2000; Chu et al., 2006), there are few published comparative investigations on the genetic diversity of these populations by genome-wide sequencing or genotyping methods (Di and Sanchez-Mazas, 2011). Therefore, the analysis of robust and reproducible genetic markers such as the POALINs and HLA-DRB1 alleles in small and isolated ethnic minorities remains an important task to better understand the human genome and its genetic variability throughout the world.
TABLE 1

Geographic and language information for the 12 minority ethnic populations sampled in the current study.

Ethnic group | Sample size | Sample location | Ancient tribe | Language family | Language subfamily | Population size
Hani | 149 | Jingha village, Jinghong County, Yunnan Province, Southwest China | Di-Qiang (dq) | Sino-Tibetan | Tibeto-Burman (TB) | 1,661,000
Jinuo | 75 | Jinuo Mountains Area, Jinghong City, Xishuangbanna Prefecture, Yunnan Province, Southwest China | Di-Qiang (dq) | Sino-Tibetan | Tibeto-Burman (TB) | 23,143
Lisu | 79 | Fugong and Gongshan Counties, Nujiang Municipality, Yunnan Province, Southwest China | Di-Qiang (dq) | Sino-Tibetan | Tibeto-Burman (TB) | 703,000
Nu | 82 | Fugong and Gongshan Counties, Nujiang Municipality, Yunnan Province, Southwest China | Di-Qiang (dq) | Sino-Tibetan | Tibeto-Burman (TB) | 37,523
Jingpo | 95 | Dehong Dai-Jingpo Autonomous Prefectures, Yunnan Province, Southwest China | Di-Qiang (dq) | Sino-Tibetan | Tibeto-Burman (TB) | 147,828
Bulang | 109 | Bulang Mountain Area, Menghai County, Yunnan Province, Southwest China | Baipu (bp) | Austro-Asiatic | Mon-Khmer (MK) | 120,000
Wa | 109 | Canyuan and Ximeng Counties, Yunnan Province, Southwest China | Baipu (bp) | Austro-Asiatic | Mon-Khmer (MK) | 430,000
Dai | 121 | Xishuangbanna Dai Autonomous Prefecture, Yunnan Province, Southwest China | Baiyue (by) | Sino-Tibetan | Daic (D) | 1,261,000
Maonan | 78 | Xianan village of Huangjiang County, Guangxi Province, Southern China | Baiyue (by) | Sino-Tibetan | Daic (D) | 101,000
Zhuang | 101 | Tiandeng County, Nanning City, Guangxi Zhuang Autonomous Province, Southern China | Baiyue (by) | Sino-Tibetan | Daic (D) | 16,926,000
Tu | 110 | Huzu County, Qinghai Province, Northwest China | Mongolian (m) | Altaic | Mongolian (M) | 290,000
Yugur | 93 | Sunan County, Gansu Province, Northwest China | Mongolian (m) | Altaic | Tujue (T) | 14,378
The aim of the present study was to elucidate the inferred haplotypic associations between the MHC POALINs and classical HLA class II alleles by determining (1) the genetic structures of the five MHC class II POALIN dimorphisms and the HLA-DRB1 allele and haplotype frequencies in 12 minority ethnic populations in China, and (2) the correlations between the genetic diversity and the four language families of these populations (Table 1). Eight of these 12 minority ethnic populations are settled in Yunnan province together with the Han people (Han-Yunnan). The Han-Yunnan, who speak Chinese of the Sino-Tibetan language family, migrated from the northern region to Yunnan province by various routes and at different times, and exhibit genetic characteristics of both northern and southern Chinese groups (Shi et al., 2006). Thus, we included the published data of the Han-Yunnan, Japanese and Caucasians as reference populations in order to compare and correlate the genetic differentiation of the HLA-DRB1 alleles and the five POALINs between the populations and the language families by using the DA genetic distance measure in phylogeny and principal component analysis (PCA).

Materials and Methods

Ethics Statement

This study was approved by the Committee on the Ethics of the Institute of Medical Biology, Chinese Academy of Medical Sciences (batch number YIKESHENGLUNZI [2012]12). Moreover, the protocol employed in this investigation was in accordance with the principles expressed in the Helsinki Declaration of 1975, as revised in 2008. Written informed consent was obtained from each participant.

Subjects and Samples

A total of 1,201 unrelated individuals were recruited from 12 Chinese minority ethnic populations in China (Figure 1). The geographic location, sample size of each population, the language family to which they belong, and the ancient groups from which they originated are listed in Table 1. These populations are descended from four ancient Chinese groups and belong to four different language subfamilies (Guo, 2000; Yao et al., 2010; Di and Sanchez-Mazas, 2011) as outlined in the introduction and Table 1. The geographic origin, nationalities, and pedigree (unrelated through at least three generations) of each individual were ascertained before sampling.
FIGURE 1

The geographic locations of the 12 Chinese ethnic populations in China. The colored labeled boxes represent the ancient tribe, language family and subfamily for each population listed in Table 1. Yellow represents Di-Qiang, Sino-Tibetan, Tibeto-Burman. Green represents Baipu, Austro-Asiatic, Mon-Khmer. White represents Baiyue, Sino-Tibetan, Daic. Orange represents Mongolian, Altaic, Mongolian. Blue represents Mongolian, Altaic, Tujue.

Genomic DNA and HLA-DRB1 Typing

Genomic DNA was extracted from peripheral lymphocytes using a QIAamp Blood Kit (Qiagen, Hilden, Germany). DNA samples were quantified with a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, United States) and adjusted to a concentration of 20 ng/μL. HLA-DRB1 was genotyped as in previous studies (Ogata et al., 2007; Shi et al., 2008, 2010a,b, 2011; Yao et al., 2012; Tao et al., 2020) using a WAKFlow HLA typing kit (Wakunaga, Hiroshima, Japan), which is based on polymerase chain reaction-sequence specific oligonucleotide probes (PCR-SSOP) coupled with multiple analyte profiling (xMAP) technology (Luminex System).

Alu and PCR Assay

The sense and antisense primers used for the PCR of the POALINs located in MHC II regions were previously reported (Kulski et al., 2010; Shi et al., 2014). As some of the previously published primers used for the PCR of the POALINs located in MHC II region have mutations in the Chinese Han in Yunnan province, new sense and antisense primer pairs were designed and used for the PCR of five POALINs located in the MHC II region (Supplementary Table 1). Supplementary Figure 1 shows a map of the locations of the five POALINs with the HLA class II regions of the MHC on chromosome 6p21.3. The PCR products were analyzed according to the fragments of different sizes by the presence or absence of an electrophoretic specific band in 2% agarose gel stained with ethidium bromide and visualized by ultraviolet light. The Alu-PCR methods clearly differentiate between an insertion and absence of insertion in heterozygous individuals based on distinctly different sized PCR products as shown in Supplementary Figure 2. The POALIN alleles are dimorphic structures whereby the absence of the Alu insertion at the Alu locus is the Alu∗1 allele and the presence of the Alu insertion is the Alu∗2 allele. The overall frequencies of the Alu∗2 (insertion) allele at each of the five loci were estimated from the genotypes as described below in the statistical section.
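As a minimal illustration of how an Alu∗2 (insertion) allele frequency follows from the three genotype classes by direct counting, the Python sketch below uses the Hani AluDPB2 genotype counts listed in Table 3. The function name is hypothetical and the sketch is not the authors' software.

```python
def insertion_frequency(n11: int, n12: int, n22: int) -> float:
    """Frequency of the Alu*2 (insertion) allele by direct counting.

    n11, n12, n22 are the counts of the 1,1 (no insertion), 1,2 (heterozygous)
    and 2,2 (homozygous insertion) genotypes.
    """
    total_alleles = 2 * (n11 + n12 + n22)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    # Each heterozygote carries one insertion allele, each 2,2 homozygote two.
    return (n12 + 2 * n22) / total_alleles


# Hani AluDPB2 genotype counts as listed in Table 3: 1,1 = 38; 1,2 = 75; 2,2 = 36
freq = insertion_frequency(38, 75, 36)
print(round(freq, 3))  # 0.493
```

The result agrees with the AluDPB2∗2 frequency reported for the Hani in Table 3.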

Allele Linkage Controls for Assessment of HLA-DRB1 Allele and POALIN Associations

To better assess the haplotypic associations between the POALINs and the HLA-DRB1 alleles, we examined their sequence linkages in 95 different MHC haplotype sequences (Kulski et al., 2021) that were sequenced, partially annotated and assembled from HLA-homozygous cell lines by Norman et al. (2017). The FASTA files of the 95 MHC class I, II and III genomic sequences were downloaded from the archives at NCBI BioProject with the accession number PRJEB6763[1] and submitted to the RepeatMasker webserver[2] for output files of annotated members of the interspersed repetitive DNA families, their locations in the sequence and their relative similarity or identity in comparison to reference sequences of SINEs, LINEs, LTRs, ERVs, DNA elements, small RNA, and simple repeats (Kulski et al., 2021). The five MHC class II POALINs were easily identified within the RepeatMasker outputs on the basis of their location and flanking sequences and/or other repeats as previously described (Kulski et al., 2010). The HLA-DRB1 alleles for all of the 95 cell line sequences were determined and reported by Norman et al. (2017). Supplementary Table 2 is a summary of the sequence linkages between the 5 POALIN and the HLA-DRB1 alleles that were determined in 90 of the sequenced haplotypes (Kulski et al., 2021). These were used as a comparative reference control to assist with a better interpretation of our results obtained for our haplotypic association analyses in 15 different populations.

Statistical Analysis

The frequencies of the five POALINs were calculated from the genotyping data by the direct-counting method. For each locus, Hardy-Weinberg equilibrium was assessed using the Guo and Thompson method (Guo and Thompson, 1992). The haplotypes were estimated by the maximum-likelihood method using the Pypop software (Lancaster et al., 2003, 2007). Pairwise LD between the POALINs and HLA alleles was calculated using the SHEsis software[3] (Shi and He, 2005). The percentage association between a POALIN insertion and an HLA allele was calculated as the percentage of the total HLA allele frequency that was associated with the presence of the POALIN insertion in the inferred HLA class II gene/POALIN haplotypes, using the haplotype frequency data generated by the Pypop software (Lancaster et al., 2003, 2007). Percentage associations between HLA allele and POALIN insertion frequencies were considered to be very strong if between 80 and 100%, strong if over 50% and up to 79%, moderate if between 20 and 50%, and low or absent if less than 20% (Kulski et al., 2010; Shi et al., 2014). The significance of differences in POALIN allele and haplotype frequencies between populations was determined by a contingency test (Fisher’s exact test). Bonferroni correction was used for multiple testing. Statistical significance was defined at the 5% level.
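The percentage association and its grading can be sketched in a few lines of Python. This is an illustration only: the function names and the example frequencies are hypothetical, and values falling between 79 and 80% are assigned to the "strong" class here, which is an assumption because the text does not specify that interval.

```python
def percentage_association(hap_freq_with_insertion: float, allele_freq: float) -> float:
    """Share (in %) of an HLA allele's total frequency that sits on
    inferred haplotypes carrying the POALIN insertion."""
    if allele_freq <= 0:
        raise ValueError("allele frequency must be positive")
    return 100.0 * hap_freq_with_insertion / allele_freq


def classify_association(percent: float) -> str:
    """Grade a percentage association using the thresholds in the text.

    Assumption: anything above 50% and below 80% is treated as 'strong'.
    """
    if percent >= 80:
        return "very strong"
    if percent > 50:
        return "strong"
    if percent >= 20:
        return "moderate"
    return "low or absent"


# Hypothetical example: an allele with total frequency 0.10, of which 0.09
# is found on haplotypes that carry the insertion.
pa = percentage_association(0.09, 0.10)
print(classify_association(pa))  # very strong
```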

Phylogenetic Analysis

The DA genetic distance was calculated from the POALIN allele frequencies, HLA-DRB1 allele frequencies and DRB1/AluDRB1 haplotypes of the different populations using the Dispan software (Nei, 1973, 1978). The MEGA 7.0 software was used to reconstruct the neighbor-joining (NJ) trees according to the DA (Tamura et al., 2007). Principal component analysis (PCA) was also performed based on either the POALIN allele or the HLA-DRB1 allele frequencies using SPSS 16.0 software. POALIN allele and HLA-DRB1 allele frequencies were obtained from additional Japanese, Caucasian and Han-Yunnan populations (Kulski et al., 2010; Shi et al., 2014) for comparative phylogenetic analysis with the frequencies obtained for the 12 Chinese ethnic populations in this study.
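For orientation, the DA distance has the standard form DA = 1 − (1/r) Σ_j Σ_i √(x_ij · y_ij), where r is the number of loci and x_ij and y_ij are the frequencies of allele i at locus j in the two populations. The Python sketch below applies this formula to the five POALIN loci of two populations, using the allele frequencies listed in Table 3. It is not the Dispan implementation, and the function and variable names are hypothetical.

```python
import math


def nei_da(pop_x, pop_y):
    """DA distance between two populations.

    pop_x and pop_y are lists with one entry per locus; each entry is a list
    of allele frequencies given in the same allele order for both populations.
    """
    if not pop_x or len(pop_x) != len(pop_y):
        raise ValueError("both populations need the same non-empty set of loci")
    shared = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        shared += sum(math.sqrt(x * y) for x, y in zip(freqs_x, freqs_y))
    return 1.0 - shared / len(pop_x)


# (Alu*1, Alu*2) frequencies at AluDPB2, AluDQA2, AluDQA1, AluDRB1, AluORF10,
# as listed in Table 3 for the Hani and the Tu.
hani = [(0.507, 0.493), (0.997, 0.003), (0.124, 0.876), (0.789, 0.211), (0.916, 0.084)]
tu = [(0.486, 0.514), (0.873, 0.127), (0.223, 0.777), (0.900, 0.100), (0.859, 0.141)]

print(round(nei_da(hani, tu), 4))
```

A matrix of such pairwise distances is the kind of input from which a neighbor-joining tree is built.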

Results

HLA-DRB1 Allele Frequencies

We summarized the HLA-DRB1 allele frequencies in 15 populations in Table 2 according to previous studies (Shi et al., 2006, 2008, 2010a,b, 2011; Ogata et al., 2007; Kulski et al., 2010; Yao et al., 2012; Tao et al., 2020). There were 57 different HLA-DRB1 alleles in the 15 populations, ranging from 12 alleles in the Jinuo to 39 alleles in the Han-Yunnan. Only six HLA-DRB1 alleles were present in all 15 populations: DRB1∗08:03, DRB1∗09:01, DRB1∗12:02, DRB1∗14:01, DRB1∗15:01, and DRB1∗15:02. Sixteen low-frequency alleles were each unique to one of six populations: DRB1∗03:05 (0.003), DRB1∗08:27 (0.003), DRB1∗09:09 (0.003), DRB1∗11:31 (0.003), DRB1∗11:52 (0.003), DRB1∗12:19 (0.003), DRB1∗13:28 (0.003), DRB1∗14:32 (0.005) and DRB1∗14:35 (0.003) in the Han-Yunnan; DRB1∗01:03 (0.011), DRB1∗08:10 (0.003) and DRB1∗11:03 (0.017) in Caucasians; DRB1∗14:06 (0.01) in the Japanese; DRB1∗14:18 (0.015) in the Zhuang; DRB1∗14:25 (0.014) in the Bulang; and DRB1∗15:11 (0.011) in the Jingpo. Among the highest-frequency DRB1 alleles were DRB1∗16:02 in the Dai, DRB1∗15:01 in the Zhuang, and DRB1∗14:01 and DRB1∗16:02 in the Maonan. In addition, DRB1∗12:02 was the most frequent allele in the Hani, Jinuo, Lisu, Nu, Jingpo, Bulang, Wa, Maonan and Han-Yunnan, ranging from 16% in the Maonan to 55% in the Bulang. The most frequent allele in the two Mongolian groups, Tu and Yugur, was DRB1∗09:01 (12.7% and 13.4%, respectively), as it was in the Japanese (20%).
TABLE 2

HLA-DRB1 allele frequencies in 15 populations.

No. | DRB1 allele | Hani (n = 149) | Jinuo (n = 75) | Lisu (n = 79) | Nu (n = 82) | Jingpo (n = 95) | Bulang (n = 109) | Wa (n = 109) | Dai (n = 121) | Maonan (n = 78) | Zhuang (n = 101) | Tu (n = 110) | Yugur (n = 93) | Han-Yunnan (n = 186) | Japanese (n = 100) | Caucasians (n = 174)
1DRB1*01:010.0130.0130.0060.0260.0640.0480.0240.080.095
2DRB1*01:020.0060.014
3DRB1*01:030.011
4DRB1*03:010.0070.0180.0580.0450.1290.0230.0650.0350.126
5DRB1*03:050.003
6DRB1*04:010.0060.0050.0270.0430.0030.020.109
7DRB1*04:020.0050.0030.006
8DRB1*04:030.0170.040.0180.0050.0180.0140.0120.0060.0050.0380.0160.0350.006
9DRB1*04:040.0130.0050.0160.0030.043
10DRB1*04:050.0380.0240.0260.0230.0280.0540.090.0150.0680.0540.0590.135
11DRB1*04:060.0060.030.0110.0050.0140.0210.0060.0150.0140.0220.0270.025
12DRB1*04:070.0060.010.02
13DRB1*04:080.0110.010.003
14DRB1*04:100.0190.0060.0090.0090.0080.01
15DRB1*07:010.0030.0130.0060.0120.0210.0050.0550.0080.0730.0650.0620.144
16DRB1*08:010.0030.0160.014
17DRB1*08:020.0060.0120.0050.0220.0080.035
18DRB1*08:030.030.0870.0760.110.0470.0140.0370.0040.0710.0150.0590.0270.0890.07
19DRB1*08:040.0110.003
20DRB1*08:090.0040.005
21DRB1*08:100.003
22DRB1*08:270.003
23DRB1*09:010.0270.0070.0510.0240.0420.0140.0050.1120.1030.050.1270.1340.1590.20.023
24DRB1*09:090.003
25DRB1*10:010.0030.0160.0080.020.0450.0540.0030.0050.006
26DRB1*11:010.0130.0470.0440.0310.0210.0050.0210.0260.050.1050.0480.0540.010.06
27DRB1*11:030.017
28DRB1*11:040.0050.0050.0080.006
29DRB1*11:060.0060.0370.0160.050.0180.004
30DRB1*11:310.003
31DRB1*11:520.003
32DRB1*12:010.0030.0060.0180.0770.0540.0110.020.026
33DRB1*12:020.3220.4070.2090.2380.4530.550.3260.0990.160.0450.0860.0540.1640.025
34DRB1*12:190.003
35DRB1*13:010.0190.0120.0730.0050.0140.0430.0110.063
36DRB1*13:020.0130.0060.0110.0050.0250.0150.0180.0380.030.049
37DRB1*13:030.0070.0120.0050.0370.0640.0250.006
38DRB1*13:280.003
39DRB1*14:010.2080.1530.1080.140.0260.0320.0140.1120.1220.1390.0450.0430.0270.040.017
40DRB1*14:020.0090.003
41DRB1*14:030.0180.0140.0050.0080.03
42DRB1*14:040.1180.120.0120.0210.0050.0690.0330.0060.0050.005
43DRB1*14:050.020.0270.0120.0130.0050.0270.0160.0080.015
44DRB1*14:060.01
45DRB1*14:070.0190.003
46DRB1*14:100.0170.013
47DRB1*14:180.015
48DRB1*14:250.014
49DRB1*14:320.005
50DRB1*14:350.003
51DRB1*15:010.0440.040.0570.0370.0580.0730.0920.0870.0960.1680.0180.0650.0730.050.103
52DRB1*15:020.070.1670.1080.0850.10.1240.0870.1360.0580.1390.0410.0220.0350.1250.011
53DRB1*15:040.0670.0070.0380.0980.0630.0370.1330.008
54DRB1*15:110.011
55DRB1*15:150.0060.005
56DRB1*16:010.0060.0090.014
57DRB1*16:020.0130.0060.0210.0230.0050.1530.1220.1490.0090.0110.0220.01

POALIN Allele Frequencies and Hardy-Weinberg’s Equilibrium (HWE)

The five POALIN allele frequencies and the genotype counts in the 12 Chinese minority populations, shown in Table 3, were compared statistically to those reported previously for the Japanese, Australian Caucasians (Kulski et al., 2010) and Chinese Han in Yunnan (Shi et al., 2014). The insertion frequencies of the five POALINs in the 12 Chinese minority populations ranged from 0.279 to 0.514 (AluDPB2), 0 to 0.127 (AluDQA2), 0.777 to 0.995 (AluDQA1), 0.1 to 0.455 (AluDRB1) and 0.084 to 0.368 (AluORF10). The significance of the pairwise differences between populations for each POALIN frequency is shown in Supplementary Table 3.
TABLE 3

The MHC POALIN allelic frequencies and genotype counts at five loci in 15 populations.

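Alu allele frequency or genotype counts in 15 populations (n)

Alu allele or genotype | Hani (n = 149) | Jinuo (n = 75) | Lisu (n = 79) | Nu (n = 82) | Jingpo (n = 95) | Bulang (n = 109) | Wa (n = 109) | Dai (n = 121) | Maonan (n = 78) | Zhuang (n = 101) | Tu (n = 110) | Yugur (n = 93) | Han-Yunnan (n = 186)ᵃ | Japanese (n = 100)ᵇ | Caucasians (n = 174)ᵇ
Alu allele: frequency
AluDPB2*1 | 0.507 | 0.620 | 0.544 | 0.585 | 0.721 | 0.583 | 0.560 | 0.545 | 0.641 | 0.574 | 0.486 | 0.527 | 0.540 | 0.500 | 0.546
AluDPB2*2 | 0.493 | 0.380 | 0.456 | 0.415 | 0.279 | 0.417 | 0.440 | 0.455 | 0.359 | 0.426 | 0.514 | 0.473 | 0.460 | 0.500 | 0.454
AluDQA2*1 | 0.997 | 0.987 | 0.981 | 0.994 | 0.895 | 0.977 | 0.995 | 0.959 | 1.000 | 0.975 | 0.873 | 0.909 | 0.997 | 0.990 | 0.790
AluDQA2*2 | 0.003 | 0.013 | 0.019 | 0.006 | 0.105 | 0.023 | 0.005 | 0.041 | 0.000 | 0.025 | 0.127 | 0.091 | 0.003 | 0.010 | 0.210
AluDQA1*1 | 0.124 | 0.067 | 0.184 | 0.195 | 0.032 | 0.005 | 0.106 | 0.050 | 0.026 | 0.134 | 0.223 | 0.097 | 0.312 | 0.432 | 0.503
AluDQA1*2 | 0.876 | 0.933 | 0.816 | 0.805 | 0.968 | 0.995 | 0.894 | 0.950 | 0.974 | 0.866 | 0.777 | 0.903 | 0.688 | 0.568 | 0.497
AluDRB1*1 | 0.789 | 0.780 | 0.804 | 0.768 | 0.753 | 0.739 | 0.683 | 0.632 | 0.705 | 0.545 | 0.900 | 0.855 | 0.820 | 0.775 | 0.744
AluDRB1*2 | 0.211 | 0.220 | 0.196 | 0.232 | 0.247 | 0.261 | 0.317 | 0.368 | 0.295 | 0.455 | 0.100 | 0.145 | 0.180 | 0.225 | 0.256
AluORF10*1 | 0.916 | 0.900 | 0.873 | 0.896 | 0.826 | 0.830 | 0.862 | 0.632 | 0.641 | 0.634 | 0.859 | 0.882 | 0.825 | 0.855 | 0.764
AluORF10*2 | 0.084 | 0.100 | 0.127 | 0.104 | 0.174 | 0.170 | 0.138 | 0.368 | 0.359 | 0.366 | 0.141 | 0.118 | 0.175 | 0.145 | 0.236
Alu genotype: counts
AluDPB2 1,1 | 38 | 29 | 24 | 29 | 50 | 44 | 35 | 37 | 34 | 30 | 28 | 23 | 57 | |
AluDPB2 1,2 | 75 | 35 | 38 | 38 | 37 | 39 | 52 | 58 | 32 | 56 | 51 | 52 | 87 | |
AluDPB2 2,2 | 36 | 11 | 17 | 15 | 8 | 26 | 22 | 26 | 12 | 15 | 31 | 18 | 42 | |
AluDQA2 1,1 | 148 | 74 | 77 | 81 | 76 | 104 | 108 | 113 | 78 | 96 | 84 | 81 | 185 | |
AluDQA2 1,2 | 1 | 0 | 1 | 1 | 18 | 5 | 1 | 6 | 0 | 5 | 24 | 7 | 1 | |
AluDQA2 2,2 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 2 | 5 | 0 | |
AluDQA1 1,1 | 14 | 3 | 13 | 15 | 3 | 0 | 5 | 4 | 1 | 13 | 21 | 5 | 28 | |
AluDQA1 1,2 | 9 | 4 | 3 | 2 | 0 | 1 | 13 | 4 | 2 | 1 | 7 | 8 | 60 | |
AluDQA1 2,2 | 126 | 68 | 63 | 65 | 92 | 108 | 91 | 113 | 75 | 87 | 82 | 80 | 98 | |
AluDRB1 1,1 | 95 | 46 | 51 | 49 | 52 | 59 | 50 | 45 | 37 | 27 | 89 | 68 | 124 | |
AluDRB1 1,2 | 45 | 25 | 25 | 28 | 39 | 43 | 49 | 63 | 36 | 56 | 20 | 23 | 57 | |
AluDRB1 2,2 | 9 | 4 | 3 | 5 | 4 | 7 | 10 | 13 | 5 | 18 | 1 | 2 | 5 | |
AluORF10 1,1 | 128 | 61 | 59 | 67 | 65 | 76 | 80 | 46 | 34 | 39 | 82 | 75 | 128 | |
AluORF10 1,2 | 17 | 13 | 20 | 13 | 27 | 29 | 28 | 61 | 32 | 50 | 25 | 14 | 51 | |
AluORF10 2,2 | 4 | 1 | 0 | 2 | 3 | 4 | 1 | 14 | 12 | 12 | 3 | 4 | 7 | |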
Of the five POALIN loci, the AluDQA1 locus showed a significant departure (P < 0.01 after Bonferroni’s correction) from HWE in 10 minority populations: the Hani, Jinuo, Lisu, Nu, Jingpo, Wa, Dai, Zhuang, Tu and Yugur (Supplementary Table 4). These data were similar to the results for the Han in Yunnan (Shi et al., 2014) and the Japanese (Kulski et al., 2010), in which the AluDQA1 locus was also not consistent with HWE. The AluDPB2 locus showed a significant departure (P < 0.01 after Bonferroni’s correction) from HWE in the Bulang, whereas the AluDQA2 locus showed a significant departure (P < 0.01 after Bonferroni’s correction) from HWE in the Jinuo and Yugur.
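To make the idea of a departure from HWE concrete, the Python sketch below compares observed genotype counts with those expected from the allele frequencies. It uses a one-degree-of-freedom chi-square goodness-of-fit test as a simpler stand-in for the Guo and Thompson exact test named in the statistical section, so its P values are not those of Supplementary Table 4. The function name is hypothetical, and the example counts are the Hani AluDQA1 genotype counts listed in Table 3.

```python
import math


def hwe_chi_square(n11: int, n12: int, n22: int):
    """Chi-square test (1 degree of freedom) of Hardy-Weinberg proportions
    for a two-allele locus. Returns (chi2, p_value)."""
    n = n11 + n12 + n22
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n11 + n12) / (2 * n)  # frequency of allele 1
    q = 1.0 - p                    # frequency of allele 2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n11, n12, n22)
    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hani AluDQA1 genotype counts as listed in Table 3: 1,1 = 14; 1,2 = 9; 2,2 = 126
chi2, p_value = hwe_chi_square(14, 9, 126)
print(round(chi2, 1), p_value < 0.01)
```

In this example the heterozygote class is far smaller than expected, which is the pattern behind the reported departure at AluDQA1.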

POALIN Haplotype Frequencies

Table 4 shows the POALIN haplotypes for the 12 Chinese minority populations, the Chinese Han in Yunnan (Shi et al., 2014), Japanese and Caucasians (Kulski et al., 2010). There were 30 five-locus POALIN haplotypes inferred in all 15 populations, with 11 in Hani, 11 in Jinuo, 15 in Lisu, 12 in Nu, 14 in Jingpo, 10 in Bulang, 12 in Wa, 16 in Dai, 11 in Maonan, 16 in Zhuang, 19 in Tu, 18 in Yugur, 14 in Han-Yunnan, 14 in Japanese and 23 in Caucasians. The haplotypes were named H1–H30, and only five of them were found in all 15 populations. These were the ancestral null H1 with no Alu insertions (AluDPB2∗1: AluDQA2∗1: AluDQA1∗1: AluDRB1∗1: AluORF10∗1), and various haplotypes with one to three Alu insertions: H5 (AluDQA1∗2), H7 (AluDQA1∗2: AluDRB1∗2), H21 (AluDPB2∗2: AluDQA1∗2) and H23 (AluDPB2∗2: AluDQA1∗2: AluDRB1∗2). The H5 (AluDQA1∗2) and H21 (AluDPB2∗2: AluDQA1∗2) haplotypes were predominant in all 12 minority populations at frequency ranges of 0.144–0.433 and 0.158–0.352, respectively, as they were in the Han-Yunnan, Japanese and Caucasians. Seven haplotypes were specific to only one particular population. These were three two-insertion haplotypes, three three-insertion haplotypes, and one five-insertion haplotype: H4 (AluDRB1∗2: AluORF10∗2) in the Japanese; H10 (AluDQA2∗2: AluORF10∗2), H12 (AluDQA2∗2: AluDRB1∗2: AluORF10∗2), H20 (AluDPB2∗2: AluDRB1∗2: AluORF10∗2), and H26 (AluDPB2∗2: AluDQA2∗2: AluDRB1∗2) in Caucasians; H11 (AluDQA2∗2: AluDRB1∗2) in Hani; and H30 (AluDPB2∗2: AluDQA2∗2: AluDQA1∗2: AluDRB1∗2: AluORF10∗2) in the Tu. The significance of the pairwise differences between populations for each haplotype frequency is shown in Supplementary Table 5.
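The five-digit haplotype codes of Table 4 (for example 11211 for H5) list the allele at AluDPB2, AluDQA2, AluDQA1, AluDRB1 and AluORF10 in that order, with 1 for no insertion and 2 for insertion. A short Python sketch of this decoding is given below; the function name is hypothetical.

```python
LOCI = ("AluDPB2", "AluDQA2", "AluDQA1", "AluDRB1", "AluORF10")


def decode_haplotype(code: str) -> list[str]:
    """Return the loci that carry an Alu insertion (allele 2) in a
    five-digit haplotype code such as '11211'."""
    if len(code) != len(LOCI) or set(code) - {"1", "2"}:
        raise ValueError("expected five digits, each 1 or 2")
    return [locus for locus, allele in zip(LOCI, code) if allele == "2"]


print(decode_haplotype("11111"))  # H1  -> []
print(decode_haplotype("11211"))  # H5  -> ['AluDQA1']
print(decode_haplotype("21211"))  # H21 -> ['AluDPB2', 'AluDQA1']
print(len(decode_haplotype("22222")))  # H30 -> 5
```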
TABLE 4

Haplotype frequencies of POALINs at five loci in 15 populations.

5-loci Alu haplotypes
Haplotype frequencies in 15 populations
No. | AluDPB2 | AluDQA2 | AluDQA1 | AluDRB1 | AluORF10 | Hani | Jinuo | Lisu | Nu | Jingpo | Bulang | Wa | Dai | Maonan | Zhuang | Tu | Yugur | Han-Yunnanᵃ | Japaneseᵇ | Caucasiansᵇ
H1111110.0160.0400.0590.0630.0050.0050.0220.0090.0070.0430.0750.0380.1020.1760.140
H2111120.0060.0100.0110.0090.0150.0560.0120.0050.031
H3111210.0070.0060.0060.0160.0060.0140.0210.040
H4111220.006
H5112110.3320.3860.3350.2830.4330.3110.3430.2290.2930.1440.2210.3050.2640.1260.153
H6112120.0100.0500.0370.0630.0200.0620.1550.0310.0400.0520.0390.0510.023
H7112210.0730.1070.0710.1670.0810.1560.1170.0830.0900.1410.0430.0720.0470.1060.021
H8112220.0750.0740.0430.0920.0250.0490.1250.0960.1400.0100.0040.0400.033
H9121110.0060.0110.0160.0180.0100.080
H10121120.0000.004
H11121210.0030.002
H12121220.005
H13122110.0060.0060.0520.0060.0100.0500.0200.0030.013
H14122120.0040.0030.034
H15122210.0130.0040.001
H16122220.0100.0230.0060.005
H17211110.0940.0210.0990.1080.0110.0700.0100.0210.0870.0280.1590.0910.091
H18211120.0060.0080.0050.0150.0090.0080.0080.0610.015
H19211210.0040.0050.0520.053
H20211220.004
H21212110.3470.3190.2380.2050.1560.3520.2060.2040.1950.2060.2970.3200.1790.1980.095
H22212120.0080.0360.0350.0080.0140.0610.0360.0170.0510.0340.0420.0280.036
H23212210.0400.0140.0450.0580.0450.0070.1040.0660.0460.0640.0130.0270.0510.0360.011
H24212220.0090.0120.0300.0190.0510.0420.0890.0630.1040.0180.0230.075
H25221110.0090.0100.060
H26221210.005
H27221220.0050.007
H28222110.0060.0430.0200.0050.0320.035
H29222210.0110.003
H30222220.017
The two most predominant haplotypes in all 15 populations were H5 (AluDPB2∗1: AluDQA2∗1: AluDQA1∗2: AluDRB1∗1: AluORF10∗1) and H21 (AluDPB2∗2: AluDQA2∗1: AluDQA1∗2: AluDRB1∗1: AluORF10∗1), both with the AluDQA1 insertion. Haplotype H6 (AluDPB2∗1: AluDQA2∗1: AluDQA1∗2: AluDRB1∗1: AluORF10∗2) differentiated the Maonan from the other populations (P < 0.01 after Bonferroni’s correction), whereas haplotype H7 (AluDPB2∗1: AluDQA2∗1: AluDQA1∗2: AluDRB1∗2: AluORF10∗1) differentiated the Caucasians from the other populations except for the Tu and Han-Yunnan (P < 0.01 after Bonferroni’s correction). Also, haplotype H8 (AluDPB2∗1: AluDQA2∗1: AluDQA1∗2: AluDRB1∗2: AluORF10∗2) differentiated the Caucasians from the other populations except for the Tu (P < 0.01 after Bonferroni’s correction). The frequency of haplotype H18 (AluDPB2∗2: AluDQA2∗1: AluDQA1∗1: AluDRB1∗1: AluORF10∗2) differed between the Japanese and the other populations, except for the Jinuo, Dai and Maonan (P < 0.01 after Bonferroni’s correction). On the other hand, haplotype H19 (AluDPB2∗2: AluDQA2∗1: AluDQA1∗1: AluDRB1∗2: AluORF10∗1) was observed in only four populations, with a significant difference obtained between Hani/Han-Yunnan and Japanese/Caucasians (P < 0.01 after Bonferroni’s correction).
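A pairwise comparison of this kind can be sketched in Python as a two-sided Fisher's exact test on a 2 × 2 table of haplotype counts, followed by a Bonferroni adjustment. The chromosome counts and the number of tests below are hypothetical and chosen only for illustration; they are not taken from Supplementary Table 5, and the function names are not from the authors' software.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x copies falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts of chromosomes with / without a given haplotype
# in two populations (for illustration only).
p = fisher_exact_two_sided(24, 132, 3, 295)
print(bonferroni(p, 105) < 0.01)
```

Here 105 is the number of pairwise comparisons among 15 populations, used as an assumed number of tests.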

LD Analysis and Percentage Haplotypic Association Between POALINs and HLA Alleles

D′ values for global LD between the five POALINs were calculated in the twelve ethnic populations and are shown in Figure 2. LD between the Alu loci varied among the ethnic populations, ranging from the absence of strong LD (D′ < 0.54) between any of the Alu loci in the Mongolian Yugur and Tu populations to strong LD (D′ > 0.8) between four or five Alu loci in the Jinuo, Nu, Bulang and Wa. The Hani, Lisu and Jingpo had strong LD (D′ > 0.8) between two or three Alu insertions, whereas the Dai, Maonan and Zhuang, of the ancient Baiyue tribe and the Daic language subfamily, had only two Alu loci in strong LD.
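For readers unfamiliar with the statistic, D′ between two biallelic loci can be computed from haplotype and allele frequencies as sketched below. This is the standard Lewontin normalisation and assumes phased haplotype frequencies are already available; it is not necessarily the procedure or software used in this study, and the example frequencies are hypothetical.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Lewontin's |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying both insertions
    p_a, p_b: insertion frequencies at the first and second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # a monomorphic locus carries no LD information
    return abs(d) / d_max


# Hypothetical frequencies: every haplotype with the second insertion
# also carries the first, so the two loci are in complete LD.
print(round(d_prime(p_ab=0.3, p_a=0.4, p_b=0.3), 3))
```

With this definition, values above 0.8 would correspond to the "strong LD" threshold used in the text.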
FIGURE 2

LD estimations (D′) among five POALINs within MHC II region for 12 Chinese ethnic populations.

Supplementary Table 6 shows the frequencies of HLA-DRB1 alleles and class II POALINs, and grades the percentage associations between these POALINs and particular HLA-DRB1 alleles as very high (80–100%), high (>50–79%), moderate (20–50%) or low (<20%). For example, all of the 19 HLA-DRB1 alleles in the Hani were associated with four of the five Alu insertions at high to very high percentages: 16 alleles (all except HLA-DRB1∗01:01, -DRB1∗08:03 and -DRB1∗10:01) associated at 67.3–100% with AluDQA1, 8 alleles (HLA-DRB1∗01:01, -DRB1∗04:03, -DRB1∗08:03, -DRB1∗09:01, -DRB1∗12:01, -DRB1∗14:01, -DRB1∗14:04 and -DRB1∗16:02) associated at 54.9–100% with AluDPB2, 4 alleles (HLA-DRB1∗01:01, -DRB1∗08:01, -DRB1∗15:02 and -DRB1∗15:04) associated at 57.1–100% with AluDRB1, and there was 100% association between AluDQA2 and HLA-DRB1∗10:01, although this allele occurred at a very low frequency (0.00336). AluORF10 was associated with the Hani HLA-DRB1 alleles only at low to moderate levels. Supplementary Table 7 summarizes the comparative percentage associations between HLA-DRB1 alleles and the class II POALINs in the 12 ethnic populations of this study, the Chinese Han in Yunnan (Shi et al., 2014), and the Japanese and Caucasians (Kulski et al., 2010) from previous studies. Overall, there was a strong similarity of haplotypic associations between AluDQA1 and HLA-DRB1 alleles in all fifteen populations. Table 5 summarizes the percentage associations between HLA-DRB1 alleles and AluDRB1. Overall, all the populations except for the Hani and the Dai had 83 to 100% association between the AluDRB1 insertion and HLA-DRB1∗15 and HLA-DRB1∗16.
In comparison, the AluDRB1 insertion was linked to six of six homozygous cell lines with HLA-DRB1∗01, seven of seven cell lines with -DRB1∗16, 10 of 11 cell lines with -DRB1∗15 and to none of the other 66 cell lines with nine other DRB1 lineage alleles (Supplementary Table 2). For the other Alu insertions, HLA-DRB1∗09 was not found in the Wa, but it had a high to very high association (51–100%) with AluDPB2 in thirteen populations and a moderate association (31.7%) in the Lisu. For a comparison of the haplotypic associations with actual genomic sequence linkages, Supplementary Table 2 shows the percentage linkage between these five POALINs and the HLA-DRB1 alleles detected in the MHC class II haplotype sequences of 90 homozygous cell lines (Kulski et al., 2021). Because of ancestral recombination at sites between the various Alu loci and the DRB1 locus, the linkages detected in the cell lines were not present in all the different Chinese ethnic populations, although the general trends were maintained between and within populations.
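The four association grades defined above can be written as a small classifier. This is only an illustrative sketch: the function name is hypothetical, and because the stated bands leave values between 79% and 80% unassigned, the sketch assumes that anything above 50% and below 80% counts as "high".

```python
def association_category(percent: float) -> str:
    """Grade a percentage association using the bands given in the text."""
    if not 0 <= percent <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent >= 80:
        return "very high"  # 80-100%
    if percent > 50:
        return "high"       # >50-79% (assumed to extend up to, not including, 80%)
    if percent >= 20:
        return "moderate"   # 20-50%
    return "low"            # <20%


# Lower and upper bounds of the Hani ranges quoted for AluDQA1 and AluDPB2.
for value in (67.3, 54.9, 100.0):
    print(value, association_category(value))
```

Applied to the Hani ranges quoted above, 54.9% and 67.3% grade as high and 100% as very high.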
TABLE 5

Percentage association between HLA-DRB1 alleles and AluDRB1.

DRB1 allele	Average of 10 ethnic groups (a)	Hani	Dai	Han-Yunnan (b)	Japanese (c)	Caucasian (c)
DRB1*01646810075100
DRB1*0325593
DRB1*04820925
DRB1*071150
DRB1*08105065
DRB1*09133
DRB1*1010050
DRB1*1117
DRB1*122294
DRB1*1320415
DRB1*14413539
DRB1*15	95	54	49	86	83	100
DRB1*16	94	50	26	87	100	100

Phylogenetic Trees and PCA Plots

To compare the diversity of these ethnic populations, we constructed phylogenetic trees (Figure 3) and PCA plots (Figure 4) based on POALIN alleles, HLA-DRB1 alleles and DRB1-AluDRB1 haplotype frequencies. The topology of the NJ tree constructed using the DA genetic distance of the POALIN alleles (Figure 3A) revealed two distinct clusters: (1) the Dai, Zhuang and Maonan of the Daic subfamily in the Sino-Tibetan language family, and (2) the Jingpo of the Tibeto-Burman subfamily in the Sino-Tibetan language family together with the Bulang stemming from the Wa; the Bulang and Wa are both part of the Mon-Khmer subfamily in the Austro-Asiatic language family. A third cluster was the stepwise grouping of the Lisu, Nu, Hani and Jinuo of the Tibeto-Burman subfamily, with the Mongolian Yugur of the Tujue subfamily in the Altaic language family inserted between the Hani and the Jinuo. The Han from Yunnan province grouped at the lower extremity of the 12 Chinese minority ethnic groups and away from the Japanese and the Caucasians, which grouped at the end of the tree opposite the Daic cluster.
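As an illustration of the distance underlying Figure 3A, Nei's DA distance between two populations can be computed from allele frequencies as sketched below. This is a minimal sketch of the standard formula, DA = 1 − (1/r) Σ over loci Σ over alleles of √(x·y), where r is the number of loci; it is not the software used in the study, and the function name and input layout are assumptions.

```python
from math import sqrt


def nei_da(pop_x, pop_y):
    """Nei's DA distance between two populations.

    Each population is a list of loci; each locus is a list of allele
    frequencies (summing to 1) given in the same allele order.
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("populations must share the same non-empty set of loci")
    total = 0.0
    for locus_x, locus_y in zip(pop_x, pop_y):
        total += sum(sqrt(x * y) for x, y in zip(locus_x, locus_y))
    return 1.0 - total / len(pop_x)


# Single-locus illustration using the AluDQA1 insertion frequencies quoted in
# the Discussion (0.777 in the Tu, 0.903 in the Yugur); the published tree
# was built from all five loci, so this value is not a figure from the paper.
tu = [[1 - 0.777, 0.777]]
yugur = [[1 - 0.903, 0.903]]
print(round(nei_da(tu, yugur), 4))
```

A matrix of such pairwise distances is the usual input to a neighbor-joining algorithm.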
FIGURE 3

Neighbor-joining trees. (A) Neighbor-joining tree based on DA genetic distance from five POALIN allele frequencies. (B) Neighbor-joining tree based on HLA-DRB1 allele frequencies. (C) Neighbor-joining tree based on the DRB1/AluDRB1 haplotype frequencies. The colored labeled boxes represent the ancient tribe, language family and subfamily for each population listed in Table 1. Yellow represents Di-Qiang, Sino-Tibetan, Tibeto-Burman. Green represents Baipu, Austro-Asiatic, Mon-Khmer. White represents Baiyue, Sino-Tibetan, Daic. Orange represents Mongolian, Altaic, Mongolian. Blue represents Mongolian, Altaic, Tujue.

FIGURE 4

Principal component analysis (PCA). (A) PCA based on five POALIN allele frequencies. Contributions of the first and second components were 43.53% and 25.96%, respectively. (B) PCA based on HLA-DRB1 allele frequencies. Contributions of the first and second components were 58.85% and 13.02%, respectively. (C) PCA based on the DRB1/AluDRB1 haplotype frequencies. Contributions of the first and second components were 58.61% and 10.35%, respectively. The colored dots represent the ancient tribe, language family and subfamily for each population listed in Table 1. Yellow represents Di-Qiang, Sino-Tibetan, Tibeto-Burman. Green represents Baipu, Austro-Asiatic, Mon-Khmer. White represents Baiyue, Sino-Tibetan, Daic. Orange represents Mongolian, Altaic, Mongolian. Blue represents Mongolian, Altaic, Tujue.

The topologies of the NJ trees based on HLA-DRB1 allele frequencies and DRB1/AluDRB1 haplotypes were similar to each other (Figures 3B,C), and both revealed two distinct clusters: (1) the Dai, Zhuang and Maonan of the Daic subfamily in the Sino-Tibetan language family, and (2) the Bulang and Wa of the Mon-Khmer subfamily in the Austro-Asiatic language family, separated from the Jingpo, Hani, Jinuo, Nu and Lisu group of the Tibeto-Burman subfamily in the Sino-Tibetan language family.
The Han-Yunnan population grouped between the Chinese minority populations and the Japanese and at a genetic distance away from the Mongolian Tu and Yugur and the Caucasians. In this regard, the POALIN and HLA-DRB1 allele frequencies both grouped the 13 Chinese ethnic populations into their respective subfamilies and language families. The main exception was that the POALIN frequencies separated the Tu and Yugur at a greater distance from each other (Figure 3A), whereas the HLA-DRB1 allele frequencies placed them more closely together between the Japanese and the Caucasians (Figures 3B,C). The PCA plots for the POALIN alleles (Figure 4A), HLA-DRB1 alleles (Figure 4B) and DRB1-AluDRB1 haplotypes (Figure 4C) showed that the distinct linguistic clusters of the 15 populations in each of four quadrants are similar to those revealed by the NJ trees (Figure 3). These plots have placed the Jingpo closer to the Mon-Khmer subfamily than to the Tibeto-Burman subfamily from which the Jingpo are believed to have originated, and the genetic distance between the Mongolian Tu and Yugur is greater for the POALIN alleles than the HLA-DRB1 alleles and DRB1-AluDRB1 haplotypes. Also, the Caucasians are the genetic outgroup in relation to the 13 Chinese ethnic populations and the Japanese.

Discussion

In this study, we examined the genetic variations of the five POALIN and HLA-DRB1 allele and haplotype frequencies to further elucidate the association between the MHC class II POALINs and the classical HLA-DRB1 allele frequencies in 12 Chinese minority populations. The HLA-DRB1 alleles are widely used for assessing the genetic structure and differences within and between populations (Di and Sanchez-Mazas, 2011; Sun et al., 2015; Weiskopf et al., 2016; Gonzalez-Galarza et al., 2020). The frequencies of the HLA-DRB1 alleles within the 12 Chinese minority populations were similar to those in previous reports (Ogata et al., 2007; Shi et al., 2008, 2010b, 2011; Sun et al., 2015; Tao et al., 2020). On the other hand, the previous studies on the distribution and frequency of the MHC class II POALIN dimorphisms were limited to only three populations, the Caucasian and Japanese (Kulski et al., 2010) and the Chinese Han in Yunnan (Shi et al., 2014), and these published data provided the three outlying comparative populations for the present study. We have therefore provided new data on the POALIN frequencies for 12 Chinese minority populations that were selected for genetic analysis because of their culture, known ancient history and connection to five distinct language subfamilies, the Tibeto-Burman, Mon-Khmer, Daic, Mongolian and Tujue (Table 1). Phylogenetic trees and PCA (Figures 3, 4) show that the Alu insertion dimorphism, HLA-DRB1 alleles and the DRB1-AluDRB1 haplotype diversity are associated strongly with the population characteristics of language family, migration and sociality. The Daic family, including the Dai, Zhuang and Maonan, always clustered closely together based on the POALIN dimorphisms, HLA-DRB1 alleles and HLA-DRB1-AluDRB1 haplotypes.
The Jinuo, Hani, Lisu and Nu of the Tibeto-Burman subfamily share certain population characteristics due to their migration from the north, and therefore are genetically closer to the northern Yugur and Tu populations, which belong to the Mongolian tribal family. Surprisingly, the Jingpo of the Tibeto-Burman subfamily are genetically closer to the Mon-Khmer family (Bulang and Wa) than to the other populations of the Tibeto-Burman subfamily, probably because these three populations have long lived closely together in the mountains of western Yunnan and have been exposed to similar pathogens in their shared environment. For example, malaria is a serious infectious disease that has been prevalent in China since 2700 BC, and Yunnan Province is a high-incidence area for malaria, especially along the border between China and Myanmar (Cox, 2010; Bi et al., 2013; Diouf et al., 2014). Similarly, the Jinuo and Bulang, who live closely together within this same area, may also have undergone high selective pressure from malaria. The five different POALIN dimorphic frequencies provide unique evolutionary and genetic information on the relationships between the 12 Chinese minority populations. The frequencies of AluDPB2, AluDQA2 and AluDQA1 in the Jingpo differed significantly from those in the other four populations of the Tibeto-Burman subfamily (P < 0.01 after Bonferroni’s correction). This suggests an expansion of these Alu insertions in the Jingpo people as a consequence of their different population histories or environmental effects. In comparison, the Bulang, a member of the Mon-Khmer family, had the highest POALIN frequency (0.995) for AluDQA1 in all 15 populations. This is the highest frequency, and the closest to subpopulation genetic fixation, reported for any of the MHC POALINs in world populations, suggesting substantial long-term population isolation.
The frequencies of AluORF10 were higher in the Dai, Maonan, and Zhuang (Daic subfamily in the Sino-Tibetan language family) than in the other nine Chinese minority populations. AluDQA1 had the highest POALIN frequency in the Tu and the Yugur (0.777 and 0.903, respectively), with a significant difference between these two populations (P < 0.01 after Bonferroni’s correction). HLA-DRB1∗09:01 had the strongest association (100%) with AluDQA1, and had the highest frequency in the Tu and Yugur (0.127 and 0.134, respectively). According to historical records, all the Altaic language speaking groups such as the Tu and the Yugur, who speak the Mongolian, Tujue, or Manchu-Tungusic sub-languages, originated from the people and places overrun by the Mongol Empire and from the border adjacent to Northeastern China in the 13th century (Guo, 2000; Chu et al., 2006). HLA-DRB1∗12:02 also had strong associations (88.7–100%) with AluDQA1 and a high frequency (0.160–0.550) in eight populations (Hani, Jinuo, Lisu, Nu, Jingpo, Bulang, Wa, and Maonan). Sun et al. (2015) reported that the distribution of DRB1 allele frequencies for a Mongolian subpopulation in Yunnan differed from that of a Mongolian population of Inner Mongolia and was much closer to that of the Hani population of Yunnan. The authors hypothesized that the difference between the two Mongolian populations was due partly to gene flow and pathogen-driven selection. We found a large differentiation between two Mongolian populations for the Alu alleles, but not for the HLA-DRB1 alleles. The Alu analysis placed the Mongolian Yugur within a cluster of the Di-Qiang subfamilies and at a substantial distance away from the Mongolian Tu, whereas the DRB1 allele frequencies for the two Mongolian populations placed them closer together at a genetic distance between the Japanese and Caucasians (Figures 3, 4).
We attribute this difference between the two Mongolian populations for the Alu analysis mainly to a twofold difference in the AluDQA1∗1 frequencies (Table 3). However, it is possible that the frequencies of particular DRB1 alleles of the two distinct Mongolian populations may have placed them closer together because of pathogen-driven selection at that particular individual gene in contrast to the more independent and possibly less effective Alu loci. In this regard, the inheritance of identical by descent or identical by state genomic loci and/or haplotypes may in part be driven by selection, gene flow and various social and geographic factors, but has yet to be defined and investigated using a greater variety of different genomic markers for comparative analyses. Overall, the branching patterns of the interrelationships between the populations and population clusters were similar for the Alu and DRB1 allelic frequencies, although the genetic distances between particular populations were substantially different. Most of these similarities are likely due to the haplotypic characteristics between the Alu dimorphism and the DRB1 alleles (Kulski et al., 2010, 2021), as exemplified in this study with a comparison between the NJ trees of the HLA-DRB1 alleles and HLA-DRB1-AluDRB1 haplotypes (Figure 3). It is clear from this and previous studies that the closer the dimorphic Alu is to the HLA-DRB1 locus, the stronger the haplotypic linkage/association and recombination resistance (Kulski et al., 2010, 2011, 2021). This seems to be the case for AluDRB1, which is most strongly associated with HLA-DRB1∗15 and -DRB1∗16 (Table 5) and is located within 14 kb of the HLA-DRB1 locus. In contrast, AluORF10 and AluDPB2, which are 233 kb and 536 kb, respectively, from the HLA-DRB1 locus (Supplementary Figure 1), are associated with many different DRB1 alleles, possibly because many more recombination events have occurred between their loci.
The five genotyped and haplotyped ‘lineage by descent,’ dimorphic Alu described in this study provide clues to the diversity of the MHC class II region of the 12 Chinese minority populations. However, further studies using fully phased genomic sequences of the MHC class II region within these historically small ethnic communities that are still strongly linked together by ancestry, culture and language might provide a better understanding of these POALIN haplotypic associations within the context of human MHC class II diversity, identity by descent (and/or by chance or state), haplotype shuffling and ancestral recombinations (Dawkins et al., 1999; Alper et al., 2006; Larsen et al., 2014). The POALINs in the current study are all members of the young Alu subfamily, with AluDQA1 and AluDRB1 belonging to the AluY subgroup and AluDQA2, AluDPB2 and AluORF10 belonging to the youngest AluYa5 or AluYb8 subgroup (Kulski et al., 2010). AluDQA1 appears to be the oldest of the five POALINs on the basis of having the highest POALIN frequency in the 15 populations (Table 3) and its association with most of the HLA-DRB1 supertypes (Supplementary Table 7). Thus, the AluDQA1 insertion was distributed widely in the Chinese ethnic populations and associated strongly as a haplotype with all or most of the HLA-DRB1 alleles. The frequency of AluDQA2 was higher in the Caucasians than in the Chinese populations or Japanese. The hypothesis that AluDQA2 may have originated in Caucasians (Kulski et al., 2010; Shi et al., 2014) is supported by the present study. The frequencies of AluDRB1 were the highest in the Dai, Maonan and Zhuang, which belong to the Daic language subfamily. AluDRB1, with a frequency ranging from 0.100 to 0.455, had a strong association with HLA-DRB1∗15 and -DRB1∗16 in most populations.
However, there was a significantly lower percentage association (<55%) between the AluDRB1 insertion and HLA-DRB1∗15 or HLA-DRB1∗16 in the Hani and Dai compared to the other 11 Chinese ethnic groups including the Han-Yunnan, and the Japanese and Caucasians (Table 5). This could be due to primer mutation with allelic dropout, an AluDRB1 deletion, recombination events, or a high level of interbreeding among members of the population with the HLA-DRB1∗15 or HLA-DRB1∗16 haplotype that was missing the AluDRB1 insertion in the founding group. By comparison, the AluDRB1 insertion is very much limited by linkage (or association) to the HLA-DRB1∗01, -DRB1∗10 (DR1 supertypes), -DRB1∗15, and -DRB1∗16 (DR51 supertypes) allelic lineages, which arose after their separation from the DR8, DR52 and DR53 supertypes (Kulski et al., 2010). On this basis, the AluDQA1 insertion most likely occurred much earlier than the AluDRB1 insertion during human evolution and population expansions. These results support the view that the AluDRB1 insertion probably originated in an ancestral HLA-DRB1 allele as a progenitor of the DR51 supertypes (Kulski et al., 2010), which contained HLA-DRB1∗15 and -DRB1∗16 (Andersson, 1998; Gibbons et al., 2004). AluDPB2 has a frequency range from 0.278 to 0.574 in fifteen populations, with low- to high-level percentage associations with many different HLA-DRB1 alleles (Supplementary Table 7). This greater number of associations between AluDPB2 and HLA-DRB1 than between AluDRB1 and HLA-DRB1 is probably because the AluDPB2 locus is 536 kb from the HLA-DRB1 locus, with the likelihood of numerous ancient recombination events occurring between the two loci (Supplementary Figure 1; Kulski et al., 2021). AluORF10 had a strong association with HLA-DRB1∗15 only in Caucasians (89.1%).
In contrast, AluORF10 was associated strongly with HLA-DRB1∗16 in eight East Asian populations (Jingpo, Wa, Maonan, Zhuang, Tu, Yugur, Han-Yunnan, and Japanese), whereas HLA-DRB1∗16 was absent in the Caucasian population. This suggests one or more recombination events at an unidentified junction between the AluORF10 and HLA-DRB1 locations in the ancestral progenitors of the DR51 supertypes. Although this study focused on Alu and HLA-DRB1 evolutionary genetic markers and population structure and was not related directly to medical or health issues, it is noteworthy that the Alu indels could have enhancer and other regulatory roles that affect the expression of HLA class II genes and/or other genes in the MHC and elsewhere in the human genome (Hasler and Strub, 2006; Moolhuijzen et al., 2010; Spirito et al., 2019; Goubert et al., 2020; Kulski et al., 2021). Many Alu elements of the AluJ, AluS, and AluY subfamilies are transcriptionally active with highly expressed self-cleaving ribozyme activity during T-cell activation and thermal and endoplasmic reticulum stress (Hernandez et al., 2020). Furthermore, Wang et al. (2017) identified two Alu indels, Alu-5072 and Alu-5075, in the class II region as potential enhancers for HLA-DRB5 and HLA-DQB1-AS1 associated with phenotypes of lymphoma, Hodgkin lymphoma and chronic hepatitis B infection, respectively. In this regard, Alu-5057 is probably the AluDRB1 indel at the 5′ end of HLA-DRB1 (Supplementary Figure 1). Thus, the question remains whether the other four Alu indels described in this study also have enhancer functions as Wang et al. (2017) reported for Alu-5072 and Alu-5075. On the basis of these published findings, the transcriptional activity and role of Alu in the human MHC during epigenetic regulation need to be investigated and better defined.
Also, the Alu indels both as genotypes and haplotypes within the MHC could have important functions in cancer, autoimmunity and immunity to infections that have yet to be addressed and investigated. We used a set of five-locus POALINs from the MHC class I region as lineage markers in a previous study to determine the haplotypic association and differentiation of MHC class I polymorphic Alu insertions and HLA-B/Cw alleles in seven Chinese ethnic populations (Yao et al., 2010). The POALIN markers that we used in this study were limited to five loci in the MHC class II region, but were a sufficient number to effectively micro-differentiate between 15 populations. The advantages of these POALIN lineage markers within the MHC class I and class II regions are their applicability – they are well defined, cheap to prepare and administer in the laboratory, and they produce results that are reasonably easy to interpret. In future work, the more widely studied MHC class I 5-loci POALINs (Yao et al., 2010; Abeid et al., 2019; Kulski et al., 2019), other autosomal Alu loci (Antunez-de-Mayolo et al., 2002), and STR loci (Garcia-Obregon et al., 2011) could be included in the haplotype analyses to broaden the genetic distances and diversity between and within the various populations. In conclusion, the unique finding in this study, not previously reported, is that the MHC class II POALIN and HLA-DRB1 allele frequencies both grouped the 12 Chinese minority ethnic populations into their respective subfamilies and language families. When compared with the previously reported data of the Chinese Han in Yunnan, Japanese and Caucasians, it is evident that the POALINs in MHC class II, like the polymorphic class I and class II HLA genes, are informative genetic and haplotype markers, which can be used cheaply and simply in studies of population diversity, forensic medicine and disease research.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author Contributions

LiS and YY conceived and designed the research. YC, SL, JY, YT, and XZ performed the experiments. YC and JK analyzed the data. SL and YT collected the samples. YC, YY, LiS, and JK wrote and revised the manuscript. All authors read and approved the final version of the manuscript. All authors contributed to the article and approved the submitted version.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  78 in total

1.  Signatures of demographic history and natural selection in the human major histocompatibility complex Loci.

Authors:  Diogo Meyer; Richard M Single; Steven J Mack; Henry A Erlich; Glenys Thomson
Journal:  Genetics       Date:  2006-05-15       Impact factor: 4.562

2.  Estimation of average heterozygosity and genetic distance from a small number of individuals.

Authors:  M Nei
Journal:  Genetics       Date:  1978-07       Impact factor: 4.562

3.  Evolutionary relationship between human major histocompatibility complex HLA-DR haplotypes.

Authors:  A C Svensson; N Setterblad; U Pihlgren; L Rask; G Andersson
Journal:  Immunogenetics       Date:  1996       Impact factor: 2.846

4.  A new HLA map of Europe: Regional genetic variation and its implication for peopling history, disease-association studies and tissue transplantation.

Authors:  Alicia Sanchez-Mazas; Stéphane Buhler; José Manuel Nunes
Journal:  Hum Hered       Date:  2014-05-21       Impact factor: 0.444

5.  The association and differentiation of MHC class I polymorphic Alu insertions and HLA-B/Cw alleles in seven Chinese populations.

Authors:  Y Yao; L Shi; L Shi; J K Kulski; J Chen; S Liu; L Yu; K Lin; X Huang; Y Tao; K Tokunaga; J Chu
Journal:  Tissue Antigens       Date:  2010-09

Review 6.  Alu elements as regulators of gene expression.

Authors:  Julien Häsler; Katharina Strub
Journal:  Nucleic Acids Res       Date:  2006-10-04       Impact factor: 16.971

7.  Dominant sequences of human major histocompatibility complex conserved extended haplotypes from HLA-DQA2 to DAXX.

Authors:  Charles E Larsen; Dennis R Alford; Michael R Trautwein; Yanoh K Jalloh; Jennifer L Tarnacki; Sushruta K Kunnenkeri; Dolores A Fici; Edmond J Yunis; Zuheir L Awdeh; Chester A Alper
Journal:  PLoS Genet       Date:  2014-10-09       Impact factor: 5.917

8.  Human Retrotransposon Insertion Polymorphisms Are Associated with Health and Disease via Gene Regulatory Phenotypes.

Authors:  Lu Wang; Emily T Norris; I K Jordan
Journal:  Front Microbiol       Date:  2017-08-02       Impact factor: 5.640

9.  Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II.

Authors:  Paul J Norman; Steven J Norberg; Lisbeth A Guethlein; Neda Nemat-Gorgani; Thomas Royce; Emily E Wroblewski; Tamsen Dunn; Tobias Mann; Claudia Alicata; Jill A Hollenbach; Weihua Chang; Melissa Shults Won; Kevin L Gunderson; Laurent Abi-Rached; Mostafa Ronaghi; Peter Parham
Journal:  Genome Res       Date:  2017-03-30       Impact factor: 9.043

10.  Impact of polymorphic transposable elements on transcription in lymphoblastoid cell lines from public data.

Authors:  Giovanni Spirito; Damiano Mangoni; Remo Sanges; Stefano Gustincich
Journal:  BMC Bioinformatics       Date:  2019-11-22       Impact factor: 3.169
