
High-Resolution HLA Typing of HLA-A, -B, -C, -DRB1, and -DQB1 in Kinh Vietnamese by Using Next-Generation Sequencing.

Minh Duc Do1, Linh Gia Hoang Le1, Vinh The Nguyen1, Tran Ngoc Dang2, Nghia Hoai Nguyen1, Hoang Anh Vu1, Thao Phuong Mai3.   

Abstract

Human leukocyte antigen (HLA) genotyping displays the particular characteristics of HLA alleles and haplotype frequencies in each population. Although it is considered the current gold standard for HLA typing, high-resolution sequence-based HLA typing is currently unavailable in Kinh Vietnamese populations. In this study, high-resolution sequence-based HLA typing (3-field) was performed using an amplicon-based next-generation sequencing platform to identify the HLA-A, -B, -C, -DRB1, and -DQB1 alleles of 101 unrelated healthy Kinh Vietnamese individuals from southern Vietnam. A total of 28 HLA-A, 41 HLA-B, 21 HLA-C, 26 HLA-DRB1, and 25 HLA-DQB1 alleles were identified. The most frequently occurring HLA alleles were A∗11:01:01, B∗15:02:01, C∗07:02:01, DRB1∗12:02:01, and DQB1∗03:01:01. Haplotype calculation showed that A∗29:01:01∼B∗07:05:01, DRB1∗12:02:01∼DQB1∗03:01:01, A∗29:01:01∼C∗15:05:02∼B∗07:05:01, A∗33:03:01∼B∗58:01:01∼DRB1∗03:01:01, and A∗29:01:01∼C∗15:05:02∼B∗07:05:01∼DRB1∗10:01:01∼DQB1∗05:01:01 were the most common haplotypes in the southern Kinh Vietnamese population. Allele distribution and haplotype analyses demonstrated that the Vietnamese population shares HLA features with South-East Asians but retains unique characteristics. Data from this study will be potentially applicable in medicine and anthropology.
Copyright © 2020 Do, Le, Nguyen, Dang, Nguyen, Vu and Mai.


Keywords:  HLA typing; Kinh Vietnamese; allele frequency; haplotype frequency; high-resolution; next-generation sequencing

Year:  2020        PMID: 32425978      PMCID: PMC7204072          DOI: 10.3389/fgene.2020.00383

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


Introduction

Human leukocyte antigen (HLA) genes, which encode major histocompatibility complex proteins in humans, are located in the short arm of chromosome 6 (Alper et al., 2006). These encoded HLA proteins are displayed on the cell surface and can be classified into two distinct classes. Class I HLA proteins (A, B, and C) present intracellular antigens originating from viruses or tumors to cytotoxic T lymphocytes. Class II HLA proteins (DR, DQ, and DP) present extracellular antigens to T-helper cells. HLA genes are highly polymorphic and play an important role in immune-mediated diseases, tumor-development processes, transplanted organ or tissue survival determination, and drug hypersensitivity (Dawson et al., 2001; Dhaliwal et al., 2003; Hung et al., 2005; Avila-Rios et al., 2009; Chen et al., 2015; Thao et al., 2018). HLA genotyping is a complex procedure due to the extreme degree of polymorphism in the major histocompatibility complex family. The most polymorphic regions, known as the core exons, are exons 2 and 3 in HLA class I genes and exon 2 in HLA class II genes. The sequences of the core exons are the most popular targets for genotyping as they are believed to be essential determinants of antigen specificity, which is informative for transplantation. However, in population genetic and evolutionary studies, many polymorphisms in other exons, introns, and UTRs have been identified and contribute to creating HLA nomenclature (Marsh and WHO Nomenclature Committee for Factors of the Hla System, 2012). Currently, HLA typing is performed using DNA-based methods, including SSP- (sequence-specific primer), SSO- (sequence-specific oligonucleotide), and RFLP-PCR (restriction fragment length polymorphism polymerase chain reaction) and sequence-based typing (SBT) (Tait et al., 2009; Bontadini, 2012; Erlich, 2012). 
SBT was considered the gold-standard method for high-resolution HLA genotyping, although this technique may produce uncertain results due to insufficient sequencing and ambiguous haplotype phasing (Erlich, 2012). Recent advancements in next-generation sequencing (NGS) technologies have significantly impacted the HLA-typing process (Abbott et al., 2006; Bentley et al., 2009; Erlich et al., 2011; Erlich, 2012; Shiina et al., 2012; Hosomichi et al., 2013, 2015; Schöfl et al., 2017). These new approaches can overcome the usual phase ambiguity of HLA alleles and enable massive, parallel, high-resolution HLA-typing. Different NGS-based HLA-typing methods have been established, such as amplicon-based HLA sequencing (Boegel et al., 2012; Shiina et al., 2012; Hosomichi et al., 2013; Schöfl et al., 2017), target enrichment of HLA genes (Wittig et al., 2015), and whole exome or genome sequencing data-derived typing (Liu et al., 2012; Major et al., 2013). Only a few studies have been performed to analyze HLA allele and haplotype frequency in the Vietnamese population (Vu-Trieu et al., 1997; Busson et al., 2002; Hoa et al., 2008). Moreover, these studies failed to present detailed HLA information due to low-resolution or incomplete loci description. There is an urgent need for an HLA-typing procedure that can yield accurate and detailed HLA allele distribution. Previous studies have investigated HLA allele distribution among the Kinh population in northern Vietnam, but this study aimed to perform high-resolution HLA typing (3-field) via NGS and determine the frequency of specific alleles and haplotypes of HLA-A, -B, -C, -DRB1, and -DQB1 in southern Kinh Vietnamese populations.

Materials and Methods

Subjects

A descriptive, cross-sectional study was conducted involving 101 unrelated healthy individuals. All subjects, who originated from Ho Chi Minh City and the surrounding Mekong delta provinces, were self-identified as Kinh Vietnamese and were recruited at the University of Medicine and Pharmacy, Ho Chi Minh City, Vietnam from August to October 2017. The study was approved by the Ethics Committee of the University of Medicine and Pharmacy at Ho Chi Minh City, Vietnam. All subjects were counseled and provided written informed consent for the study.

DNA Extraction

Venous blood (2 ml) was collected from each subject using an EDTA anticoagulant tube. Genomic DNA was extracted from peripheral blood leukocytes using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol, and samples were stored at −20°C until analysis. Genomic DNA quality was assessed by measuring absorbance at 260 nm using a NanoDrop 2000 (Thermo Scientific, MA, United States), and the optical density (OD) ratio (260/280 nm) was calculated to evaluate sample purity. The recommended purified genomic DNA concentration (≥30 μg/μL) and OD ratio (≥1.8) for library preparation were ascertained.

Library Preparation

The HLA TruSight kit (CareDx, Brisbane, CA, United States) was used for library preparation. Library construction began with a long-range PCR for full-length HLA-A, -B, -C, -DRB1, and -DQB1 loci. All amplicons were normalized to prevent sequencing bias between samples by using magnetic beads consisting of carboxy-coated paramagnetic particles (Hawkins et al., 1994). The beads bound saturating amounts of DNA, and the DNA concentration was normalized to a similar concentration across samples after the washing and elution steps (Hosomichi et al., 2014). Subsequently, the DNA amplicons were fragmented into approximately 2-kb pieces, indexed, and pooled for sequencing on the MiniSeq platform (Illumina, San Diego, CA, United States). The pooled library was quantitated before loading on the MiniSeq because the library concentration determines cluster density, which is an important parameter for data quality. As instructed in the Illumina protocol, a Qubit 3.0 fluorometer (Thermo Scientific, Waltham, MA, United States) was used for library quantitation. The pooled library was loaded onto the MiniSeq system when its concentration was ≥10 ng/μL.

Sequencing

Next-generation sequencing was performed on the MiniSeq system. Each sample was examined for average depth of coverage and Q30 quality scores, which were >200 and 85, respectively, for all five loci. The sequences were subsequently analyzed using Assign TruSight HLA v2.0 (CareDx, Brisbane, CA, United States).

HLA Assigned by Assign TruSight HLA v2.0

Qualified FASTQ files from the MiniSeq system were analyzed by Assign TruSight HLA v2.0 (CareDx, Brisbane, CA, United States). Results with 0 core exon mismatch and phasing ≤2 were accepted. Although full-length HLA loci were sequenced, the maximum resolution that the software Assign TruSight HLA v2.0 can provide is 3-field. Higher resolution (4-field) can be achieved if other analysis tools are applied to assign HLA alleles.

Statistical Analysis

For single-locus analysis, allele frequencies were calculated by direct counting, deviation from Hardy–Weinberg (HW) proportions was calculated via chi-square test, and the Ewens–Watterson (EW) homozygosity test of neutrality was also performed via Monte-Carlo implementation of the exact test (Ewens, 1972; Watterson, 1978; Slatkin, 1996). The calculation was executed in PyPop: Python for Population Genomics (Lancaster et al., 2007). For multiple-locus analysis, haplotype frequencies were estimated using an expectation-maximization algorithm by Arlequin ver. 3.5 with default settings (Excoffier and Lischer, 2010); linkage disequilibrium (LD) between all HLA allele pairs was analyzed in PyPop, in which D′ and Wn of specific allele pairs were calculated (Lancaster et al., 2007). LD between all HLA loci pairs was further calculated and plotted using conditional asymmetric linkage disequilibrium (ALD) measures (Thomson and Single, 2014). The principal component analysis (PCA) of HLA-A, -B, and -DRB1 was performed using Excel 2010 to compare allele distribution between our data (n = 101) and HLA allele frequency data of the Vietnamese Hanoi Kinh population 2 (n = 170), Chinese Canton Han population (n = 264), Indonesian Sundanese and Javanese population (n = 201), Thai population (n = 142), Japanese population 3 (n = 1018), South Korean population 3 (n = 485), and Malaysian Peninsular Malay population (n = 951), which were retrieved from the Allele Frequencies Net Database (allelefrequencies.net) (González-Galarza et al., 2015). Due to the unavailability of 3-field HLA data in previous studies, we converted 3-field to 2-field data. For example, HLA-A∗24:02:01, A∗24:02:13, and A∗24:02:40 were converted to HLA-A∗24:02 with a frequency (0.13861) that was the sum of the three 3-field alleles (0.12871, 0.00495, and 0.00495, respectively). PCA results were plotted using BioVinci software (BioTuring Inc., San Diego, CA, United States).
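The 3-field to 2-field conversion described above can be sketched in a few lines of Python. This is only an illustration of the summation step, not the procedure the authors actually ran; the function names are hypothetical, and allele names are written with an ASCII asterisk for convenience. The three frequencies are the ones quoted in the text.

```python
from collections import defaultdict


def to_two_field(allele: str) -> str:
    """Keep only the first two colon-separated fields of an allele name."""
    return ":".join(allele.split(":")[:2])


def collapse_to_two_field(freqs: dict) -> dict:
    """Sum the frequencies of all alleles that share the same first two fields."""
    collapsed = defaultdict(float)
    for allele, freq in freqs.items():
        collapsed[to_two_field(allele)] += freq
    return dict(collapsed)


# 3-field frequencies quoted in the text for the A*24:02 group.
hla_a_2402 = {
    "A*24:02:01": 0.12871,
    "A*24:02:13": 0.00495,
    "A*24:02:40": 0.00495,
}

two_field = collapse_to_two_field(hla_a_2402)
print(round(two_field["A*24:02"], 5))  # 0.13861, the value given in the text
```

Alleles that are already reported at 2-field resolution (for example 11:04) pass through unchanged, so the same function can be applied to a whole locus.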

Results

Advancements in NGS offer the ability to distinguish between a set of alleles that share two field names and differ in the third field, such as A∗24:02, C∗07:01, and DQB1∗05:02, in one sequencing batch. As the polymorphisms of A∗24:02:40, A∗24:02:13, C∗07:01:02, and DQB1∗05:02:02 are not in the core exons, several traditional PCR and sequencing reactions were required to determine these alleles before NGS methods became available.

Allele Frequencies

The numbers of HLA-A, HLA-B, HLA-C, HLA-DRB1, and HLA-DQB1 alleles detected in this study were 28, 41, 21, 26, and 25, respectively. The frequencies of HLA class I and class II alleles are summarized in Table 1. HLA-A∗11:01:01, A∗24:02:01, and A∗33:03:01 (22.77, 12.87, and 10.89%, respectively) were the three most frequent HLA-A alleles, followed by A∗02:07:01, A∗29:01:01, and A∗02:03:01 (9.90, 8.42, and 7.43%, respectively). HLA-B∗15:02:01, B∗46:01:01, B∗58:01:01, B∗40:01:02, B∗38:02:01, and B∗07:05:01 (11.88, 9.41, 8.42, 7.92, 7.92, and 6.93%, respectively) were the most frequent HLA-B alleles. The most frequent alleles at the HLA-C locus were HLA-C∗07:02:01, C∗01:02:01, and C∗08:01:01 (21.78, 13.37, and 12.87%, respectively). HLA-DRB1∗12:02:01 accounted for 22.28% of the HLA-DRB1 alleles. HLA-DRB1∗09:01:02 was the second most frequent allele (13.37%), followed by DRB1∗15:02:01, DRB1∗10:01:01, DRB1∗03:01:01, and DRB1∗04:05:01 (9.90, 7.92, 7.43, and 6.44%, respectively). At the HLA-DQB1 locus, DQB1∗03:01:01 was the most frequent allele (28.71%), followed by DQB1∗03:03:02, DQB1∗05:01:01, and DQB1∗05:02:01 (12.87, 10.89, and 9.90%, respectively).
TABLE 1

HLA frequency in the Kinh population (n = 101) (AF: allele frequency).

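| A | Count | AF | C | Count | AF | B | Count | AF | DRB1 | Count | AF | DQB1 | Count | AF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01:01:01 | 3 | 0.01485 | 01:02:01 | 27 | 0.13366 | 07:02:01 | 4 | 0.01980 | 03:01:01 | 15 | 0.07426 | 02:01:01 | 14 | 0.06931 |
| 02:01:01 | 6 | 0.02970 | 03:02:02 | 18 | 0.08911 | 07:05:01 | 14 | 0.06931 | 04:01:01 | 1 | 0.00495 | 02:02:01 | 4 | 0.01980 |
| 02:03:01 | 15 | 0.07426 | 03:03:01 | 9 | 0.04455 | 08:01:01 | 1 | 0.00495 | 04:03:01 | 3 | 0.01485 | 03:01:01 | 58 | 0.28713 |
| 02:03:02 | 1 | 0.00495 | 03:04:01 | 15 | 0.07426 | 13:01:01 | 6 | 0.02970 | 04:05:01 | 13 | 0.06436 | 03:02:01 | 5 | 0.02475 |
| 02:06:01 | 6 | 0.02970 | 03:04:02 | 1 | 0.00495 | 13:02:01 | 2 | 0.00990 | 04:06:01 | 2 | 0.00990 | 03:03:02 | 26 | 0.12871 |
| 02:07:01 | 20 | 0.09901 | 03:17 | 1 | 0.00495 | 15:01:01 | 2 | 0.00990 | 07:01:01 | 6 | 0.02970 | 03:03:05 | 1 | 0.00495 |
| 03:01:01 | 1 | 0.00495 | 04:01:01 | 10 | 0.04950 | 15:02:01 | 24 | 0.11881 | 08:03:02 | 11 | 0.05445 | 03:05:02 | 1 | 0.00495 |
| 03:02:01 | 2 | 0.00990 | 04:03:01 | 15 | 0.07426 | 15:11:01 | 1 | 0.00495 | 08:12 | 1 | 0.00495 | 04:01:01 | 10 | 0.04950 |
| 11:01:01 | 46 | 0.22772 | 04:82 | 1 | 0.00495 | 15:12 | 3 | 0.01485 | 09:01:02 | 27 | 0.13366 | 04:02:01 | 2 | 0.00990 |
| 11:02:01 | 5 | 0.02475 | 06:02:01 | 4 | 0.01980 | 15:13:01 | 1 | 0.00495 | 10:01:01 | 16 | 0.07921 | 05:01:01 | 22 | 0.10891 |
| 11:04 | 2 | 0.00990 | 07:01:01 | 1 | 0.00495 | 15:17:01 | 1 | 0.00495 | 11:01:01 | 5 | 0.02475 | 05:01:03 | 2 | 0.00990 |
| 24:02:01 | 26 | 0.12871 | 07:01:02 | 1 | 0.00495 | 15:25:01 | 11 | 0.05446 | 11:06:01 | 3 | 0.01485 | 05:01:12 | 1 | 0.00495 |
| 24:02:13 | 1 | 0.00495 | 07:02:01 | 44 | 0.21782 | 15:27:01 | 1 | 0.00495 | 11:129 | 1 | 0.00495 | 05:02:01 | 20 | 0.09901 |
| 24:02:40 | 1 | 0.00495 | 07:04:01 | 2 | 0.00990 | 15:35 | 2 | 0.00990 | 12:02:01 | 45 | 0.22277 | 05:02:02 | 2 | 0.00990 |
| 24:03:01 | 2 | 0.00990 | 07:06 | 1 | 0.00495 | 18:01:01 | 2 | 0.00990 | 13:01:01 | 1 | 0.00495 | 05:02:04 | 1 | 0.00495 |
| 24:07:01 | 6 | 0.02970 | 08:01:01 | 26 | 0.12871 | 18:02 | 1 | 0.00495 | 13:02:01 | 3 | 0.01485 | 05:03:01 | 5 | 0.02475 |
| 24:10:01 | 1 | 0.00495 | 08:03:01 | 2 | 0.00990 | 27:06 | 4 | 0.01980 | 13:12:01 | 6 | 0.02970 | 05:03:02 | 1 | 0.00495 |
| 24:20 | 3 | 0.01485 | 12:02:02 | 3 | 0.01485 | 35:01:01 | 4 | 0.01980 | 14:04:01 | 1 | 0.00495 | 05:03:11 | 1 | 0.00495 |
| 26:01:01 | 4 | 0.01980 | 14:02:01 | 4 | 0.01980 | 35:03:01 | 1 | 0.00495 | 14:05:01 | 2 | 0.00990 | 05:10 | 1 | 0.00495 |
| 29:01:01 | 17 | 0.08416 | 15:02:01 | 3 | 0.01485 | 35:05:01 | 7 | 0.03465 | 14:10 | 1 | 0.00495 | 05:18 | 2 | 0.00990 |
| 30:01:01 | 1 | 0.00495 | 15:05:02 | 14 | 0.06931 | 37:01:01 | 1 | 0.00495 | 14:18 | 1 | 0.00495 | 06:01:01 | 17 | 0.08416 |
| 31:01:02 | 3 | 0.01485 |  |  |  | 38:02:01 | 16 | 0.07921 | 14:54:01 | 3 | 0.01485 | 06:02:01 | 2 | 0.00990 |
| 32:01:01 | 1 | 0.00495 |  |  |  | 39:01:01 | 4 | 0.01980 | 15:01:01 | 5 | 0.02475 | 06:03:01 | 1 | 0.00495 |
| 33:01:01 | 2 | 0.00990 |  |  |  | 39:09:01 | 1 | 0.00495 | 15:02:01 | 20 | 0.09901 | 06:04:01 | 1 | 0.00495 |
| 33:03:01 | 22 | 0.10891 |  |  |  | 40:01:02 | 16 | 0.07921 | 15:02:02 | 1 | 0.00495 | 06:09:01 | 2 | 0.00990 |
| 34:01:01 | 3 | 0.01485 |  |  |  | 40:02:01 | 1 | 0.00495 | 16:02:01 | 9 | 0.04455 |  |  |  |
| 68:01:02 | 1 | 0.00495 |  |  |  | 40:06:01 | 4 | 0.01980 |  |  |  |  |  |  |
| 74:02:01 | 1 | 0.00495 |  |  |  | 44:03:02 | 2 | 0.00990 |  |  |  |  |  |  |
|  |  |  |  |  |  | 46:01:01 | 19 | 0.09406 |  |  |  |  |  |  |
|  |  |  |  |  |  | 48:01:01 | 3 | 0.01485 |  |  |  |  |  |  |
|  |  |  |  |  |  | 51:01:01 | 4 | 0.01980 |  |  |  |  |  |  |
|  |  |  |  |  |  | 51:02:01 | 3 | 0.01485 |  |  |  |  |  |  |
|  |  |  |  |  |  | 51:06:01 | 1 | 0.00495 |  |  |  |  |  |  |
|  |  |  |  |  |  | 52:01:01 | 4 | 0.01980 |  |  |  |  |  |  |
|  |  |  |  |  |  | 54:01:01 | 3 | 0.01485 |  |  |  |  |  |  |
|  |  |  |  |  |  | 55:02:01 | 4 | 0.01980 |  |  |  |  |  |  |
|  |  |  |  |  |  | 55:18 | 1 | 0.00495 |  |  |  |  |  |  |
|  |  |  |  |  |  | 56:01:01 | 3 | 0.01485 |  |  |  |  |  |  |
|  |  |  |  |  |  | 56:04 | 2 | 0.00990 |  |  |  |  |  |  |
|  |  |  |  |  |  | 57:01:01 | 1 | 0.00495 |  |  |  |  |  |  |
|  |  |  |  |  |  | 58:01:01 | 17 | 0.08416 |  |  |  |  |  |  |
| Total | 202 | 1.00000 | Total | 202 | 1.00000 | Total | 202 | 1.00000 | Total | 202 | 1.00000 | Total | 202 | 1.00000 |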
No tested loci showed any significant departure from Hardy–Weinberg equilibrium; p-values for the all-homozygotes and all-heterozygotes tests were 0.79 and 0.93, 0.73 and 0.93, 0.33 and 0.73, 0.68 and 0.89, and 0.40 and 0.74 for the HLA-A, -B, -C, -DRB1, and -DQB1 loci, respectively. The results of the EW homozygosity test of neutrality are summarized in Table 2. p-values of F were 0.64, 0.37, 0.22, 0.44, and 0.76 for the HLA-A, -B, -C, -DRB1, and -DQB1 loci, respectively.
TABLE 2

Results of the Ewens–Watterson homozygosity test of neutrality.

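| Locus | Number of alleles | Fobs | Fexp | p-value |
|---|---|---|---|---|
| A | 28 | 0.1078 | 0.1055 | 0.6404 |
| B | 41 | 0.0576 | 0.0650 | 0.3709 |
| C | 21 | 0.1117 | 0.1483 | 0.2166 |
| DRB1 | 26 | 0.1024 | 0.1154 | 0.4389 |
| DQB1 | 25 | 0.1374 | 0.1210 | 0.7593 |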
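As a rough illustration of how the Fobs column relates to Table 1, the sketch below computes allele frequencies by direct counting and the observed homozygosity as the sum of squared allele frequencies, using the HLA-A counts copied from Table 1. The function names are hypothetical. The sketch does not reproduce PyPop's code, nor the Monte-Carlo step that yields Fexp and the p-value.

```python
# HLA-A allele counts copied from Table 1 (2n = 202 chromosomes).
hla_a_counts = {
    "01:01:01": 3, "02:01:01": 6, "02:03:01": 15, "02:03:02": 1,
    "02:06:01": 6, "02:07:01": 20, "03:01:01": 1, "03:02:01": 2,
    "11:01:01": 46, "11:02:01": 5, "11:04": 2, "24:02:01": 26,
    "24:02:13": 1, "24:02:40": 1, "24:03:01": 2, "24:07:01": 6,
    "24:10:01": 1, "24:20": 3, "26:01:01": 4, "29:01:01": 17,
    "30:01:01": 1, "31:01:02": 3, "32:01:01": 1, "33:01:01": 2,
    "33:03:01": 22, "34:01:01": 3, "68:01:02": 1, "74:02:01": 1,
}


def allele_frequencies(counts):
    """Direct counting: each allele count divided by the number of chromosomes."""
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_homozygosity(counts):
    """Observed homozygosity F, taken here as the sum of squared allele frequencies."""
    return sum(f * f for f in allele_frequencies(counts).values())


freqs = allele_frequencies(hla_a_counts)
print(round(freqs["11:01:01"], 5))                   # 0.22772, as in Table 1
print(round(observed_homozygosity(hla_a_counts), 4)) # 0.1078, as in Table 2
```

The same two functions can be applied to the other four loci by swapping in the corresponding count column of Table 1.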

Haplotype Frequencies

Tables 3, 4, and 5 list the 20 most common two-locus, three-locus, and five-locus haplotypes. The most frequent haplotypes in the two-locus sets were A∗29:01:01∼B∗07:05:01 (6.93%), A∗33:03:01∼B∗58:01:01 (6.44%), and A∗11:01:01∼B∗15:02:01 (5.87%) for A∼B, and DRB1∗12:02:01∼DQB1∗03:01:01 (21.28%), DRB1∗09:01:02∼DQB1∗03:03:02 (11.88%), and DRB1∗10:01:01∼DQB1∗05:01:01 (7.43%) for DRB1∼DQB1. The most frequent haplotypes in the two three-locus sets were A∗29:01:01∼C∗15:05:02∼B∗07:05:01 (6.93%) and A∗33:03:01∼B∗58:01:01∼DRB1∗03:01:01 (4.95%). The three most frequent five-locus haplotypes were A∗29:01:01∼C∗15:05:02∼B∗07:05:01∼DRB1∗10:01:01∼DQB1∗05:01:01 (4.46%), A∗33:03:01∼C∗03:02:02∼B∗58:01:01∼DRB1∗03:01:01∼DQB1∗02:01:01 (4.46%), and A∗11:01:01∼C∗08:01:01∼B∗15:02:01∼DRB1∗12:02:01∼DQB1∗03:01:01 (3.84%). The likelihood ratio test of linkage disequilibrium demonstrated that all two-, three- and five-locus associations were statistically significant (p < 0.001). Data on the full two-locus, three-locus, five-locus, and ten-locus haplotype frequencies are described in Supplementary Tables 1, 2, 3, and 4.
TABLE 3

Haplotype frequencies of two-locus HLA.

| A | B | Est. count | hap.freq | DRB1 | DQB1 | Est. count | hap.freq |
|---|---|---|---|---|---|---|---|
| 29:01:01 | 07:05:01 | 14.00 | 0.06931 | 12:02:01 | 03:01:01 | 43.00 | 0.21283 |
| 33:03:01 | 58:01:01 | 13.00 | 0.06436 | 09:01:02 | 03:03:02 | 24.00 | 0.11881 |
| 11:01:01 | 15:02:01 | 11.86 | 0.05869 | 10:01:01 | 05:01:01 | 15.00 | 0.07426 |
| 02:07:01 | 46:01:01 | 10.40 | 0.05146 | 03:01:01 | 02:01:01 | 14.00 | 0.06931 |
| 02:03:01 | 38:02:01 | 7.92 | 0.03922 | 08:03:02 | 06:01:01 | 11.00 | 0.05446 |
| 11:01:01 | 40:01:02 | 7.70 | 0.03810 | 04:05:01 | 04:01:01 | 10.00 | 0.04950 |
| 11:01:01 | 38:02:01 | 4.69 | 0.02322 | 16:02:01 | 05:02:01 | 9.00 | 0.04455 |
| 02:07:01 | 15:02:01 | 4.15 | 0.02052 | 13:12:01 | 03:01:01 | 6.00 | 0.02970 |
| 24:02:01 | 27:06:00 | 4.00 | 0.01980 | 15:02:01 | 05:01:01 | 6.00 | 0.02970 |
| 24:07:01 | 35:05:01 | 4.00 | 0.01980 | 15:02:01 | 05:02:01 | 5.99 | 0.02966 |
| 11:01:01 | 13:01:01 | 3.82 | 0.01891 | 07:01:01 | 02:02:01 | 4.00 | 0.01980 |
| 11:01:01 | 15:25:01 | 3.61 | 0.01789 | 11:01:01 | 03:01:01 | 4.00 | 0.01980 |
| 24:02:01 | 46:01:01 | 3.28 | 0.01624 | 04:03:01 | 03:02:01 | 3.00 | 0.01485 |
| 11:01:01 | 39:01:01 | 3.00 | 0.01485 | 04:05:01 | 04:02:01 | 2.00 | 0.00990 |
| 24:02:01 | 15:02:01 | 3.00 | 0.01485 | 04:06:01 | 03:02:01 | 2.00 | 0.00990 |
| 24:02:01 | 15:25:01 | 2.39 | 0.01181 | 11:06:01 | 03:01:01 | 2.00 | 0.00990 |
| 11:01:01 | 46:01:01 | 2.33 | 0.01151 | 13:02:01 | 06:09:01 | 2.00 | 0.00990 |
| 24:02:01 | 40:01:02 | 2.23 | 0.01102 | 14:05:01 | 05:03:01 | 2.00 | 0.00990 |
| 02:01:01 | 35:01:01 | 2.00 | 0.00990 | 14:54:01 | 05:02:01 | 2.00 | 0.00990 |
| 02:01:01 | 40:01:02 | 2.00 | 0.00990 | 15:01:01 | 06:01:01 | 2.00 | 0.00990 |
TABLE 4

Haplotype frequencies of three-locus HLA.

| A | C | B | Est. count | hap.freq | A | B | DRB1 | Est. count | hap.freq |
|---|---|---|---|---|---|---|---|---|---|
| 29:01:01 | 15:05:02 | 07:05:01 | 14.00 | 0.06931 | 33:03:01 | 58:01:01 | 03:01:01 | 10.00 | 0.04950 |
| 33:03:01 | 03:02:02 | 58:01:01 | 13.00 | 0.06436 | 29:01:01 | 07:05:01 | 10:01:01 | 9.00 | 0.04455 |
| 11:01:01 | 08:01:01 | 15:02:01 | 10.84 | 0.05367 | 11:01:01 | 15:02:01 | 12:02:01 | 8.43 | 0.04175 |
| 02:07:01 | 01:02:01 | 46:01:01 | 10.38 | 0.05138 | 02:07:01 | 46:01:01 | 09:01:02 | 5.83 | 0.02886 |
| 02:03:01 | 07:02:01 | 38:02:01 | 8.00 | 0.03960 | 02:03:01 | 38:02:01 | 08:03:02 | 4.00 | 0.01980 |
| 11:01:01 | 04:03:01 | 15:25:01 | 5.00 | 0.02475 | 11:01:01 | 40:01:02 | 12:02:01 | 4.00 | 0.01980 |
| 11:01:01 | 07:02:01 | 38:02:01 | 5.00 | 0.02475 | 24:02:01 | 27:06:00 | 12:02:01 | 4.00 | 0.01980 |
| 11:01:01 | 07:02:01 | 40:01:02 | 5.00 | 0.02475 | 29:01:01 | 07:05:01 | 09:01:02 | 4.00 | 0.01980 |
| 02:07:01 | 08:01:01 | 15:02:01 | 4.16 | 0.02059 | 02:07:01 | 15:02:01 | 12:02:01 | 3.17 | 0.01569 |
| 24:07:01 | 04:01:01 | 35:05:01 | 4.00 | 0.01980 | 11:01:01 | 46:01:01 | 09:01:02 | 3.17 | 0.01569 |
| 11:01:01 | 03:04:01 | 13:01:01 | 3.82 | 0.01891 | 24:02:01 | 15:25:01 | 15:02:01 | 3.00 | 0.01485 |
| 24:02:01 | 01:02:01 | 46:01:01 | 3.28 | 0.01625 | 24:02:01 | 35:05:01 | 08:03:02 | 3.00 | 0.01485 |
| 24:02:01 | 04:03:01 | 15:25:01 | 3.00 | 0.01485 | 24:02:01 | 15:02:01 | 12:02:01 | 2.40 | 0.01187 |
| 24:02:01 | 03:04:01 | 27:06:00 | 3.00 | 0.01485 | 02:01:01 | 35:01:01 | 07:01:01 | 2.00 | 0.00990 |
| 11:01:01 | 01:02:01 | 46:01:01 | 2.34 | 0.01158 | 02:01:01 | 40:01:02 | 09:01:02 | 2.00 | 0.00990 |
| 02:01:01 | 03:03:01 | 35:01:01 | 2.00 | 0.00990 | 02:03:01 | 38:02:01 | 16:02:01 | 2.00 | 0.00990 |
| 11:01:01 | 07:02:01 | 39:01:01 | 2.00 | 0.00990 | 02:07:01 | 46:01:01 | 11:01:01 | 2.00 | 0.00990 |
| 11:02:01 | 07:02:01 | 40:01:02 | 2.00 | 0.00990 | 02:07:01 | 46:01:01 | 12:02:01 | 2.00 | 0.00990 |
| 24:02:01 | 08:01:01 | 15:02:01 | 2.00 | 0.00990 | 11:01:01 | 07:02:01 | 10:01:01 | 2.00 | 0.00990 |
| 24:02:01 | 04:01:01 | 35:05:01 | 2.00 | 0.00990 | 11:01:01 | 15:02:01 | 07:01:01 | 2.00 | 0.00990 |
TABLE 5

Haplotype frequencies of five-locus HLA.

| A | C | B | DRB1 | DQB1 | Est. count | hap.freq |
|---|---|---|---|---|---|---|
| 29:01:01 | 15:05:02 | 07:05:01 | 10:01:01 | 05:01:01 | 9.00 | 0.04455 |
| 33:03:01 | 03:02:02 | 58:01:01 | 03:01:01 | 02:01:01 | 9.00 | 0.04455 |
| 11:01:01 | 08:01:01 | 15:02:01 | 12:02:01 | 03:01:01 | 7.76 | 0.03844 |
| 02:07:01 | 01:02:01 | 46:01:01 | 09:01:02 | 03:03:02 | 5.76 | 0.02854 |
| 11:01:01 | 01:02:01 | 46:01:01 | 09:01:02 | 03:03:02 | 4.23 | 0.02096 |
| 02:03:01 | 07:02:01 | 38:02:01 | 08:03:02 | 06:01:01 | 4.00 | 0.01980 |
| 02:07:01 | 08:01:01 | 15:02:01 | 12:02:01 | 03:01:01 | 3.23 | 0.01601 |
| 11:01:01 | 07:02:01 | 40:01:02 | 12:02:01 | 03:01:01 | 3.00 | 0.01485 |
| 24:07:01 | 04:01:01 | 35:05:01 | 12:02:01 | 03:01:01 | 3.00 | 0.01485 |
| 29:01:01 | 15:05:02 | 07:05:01 | 09:01:02 | 03:03:02 | 3.00 | 0.01485 |
| 02:01:01 | 03:03:01 | 35:01:01 | 07:01:01 | 02:02:01 | 2.00 | 0.00990 |
| 02:03:01 | 07:02:01 | 38:02:01 | 12:02:01 | 05:02:01 | 2.00 | 0.00990 |
| 02:07:01 | 08:01:01 | 15:02:01 | 13:12:01 | 03:01:01 | 2.00 | 0.00990 |
| 02:07:01 | 01:02:01 | 46:01:01 | 12:02:01 | 03:01:01 | 2.00 | 0.00990 |
| 11:01:01 | 07:02:01 | 07:02:01 | 10:01:01 | 05:01:01 | 2.00 | 0.00990 |
| 11:01:01 | 08:01:01 | 15:02:01 | 07:01:01 | 02:02:01 | 2.00 | 0.00990 |
| 11:01:01 | 07:02:01 | 15:25:01 | 15:02:01 | 03:01:01 | 2.00 | 0.00990 |
| 11:01:01 | 07:02:01 | 38:02:01 | 15:02:01 | 05:01:01 | 2.00 | 0.00990 |
| 11:01:01 | 07:02:01 | 39:01:01 | 14:54:01 | 03:01:01 | 2.00 | 0.00990 |
| 11:01:01 | 03:02:02 | 58:01:01 | 03:01:01 | 02:01:01 | 2.00 | 0.00990 |
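The fractional "Est. count" values in Tables 3 to 5 (for example 11.86) are what an expectation-maximization (EM) estimate produces when individuals with ambiguous phase are shared out among the haplotype pairs they could carry. The following is a minimal two-locus EM sketch under stated assumptions: it is a generic textbook version, not Arlequin's implementation, and the function names, allele labels, and toy genotypes are all hypothetical.

```python
from collections import defaultdict


def compatible_phases(geno_a, geno_b):
    """Distinct haplotype pairs consistent with one unphased two-locus genotype."""
    (a1, a2), (b1, b2) = geno_a, geno_b
    phases = {
        tuple(sorted([(a1, b1), (a2, b2)])),
        tuple(sorted([(a1, b2), (a2, b1)])),
    }
    return sorted(phases)  # one pair unless the individual is a double heterozygote


def em_haplotype_frequencies(genotypes, n_iter=100, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes by EM."""
    per_individual = [compatible_phases(ga, gb) for ga, gb in genotypes]
    haplotypes = sorted({h for phases in per_individual for pair in phases for h in pair})
    freqs = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        expected = defaultdict(float)
        # E-step: split each individual among its compatible phases.
        for phases in per_individual:
            weights = [freqs[h1] * freqs[h2] for h1, h2 in phases]
            total = sum(weights)
            for (h1, h2), w in zip(phases, weights):
                share = w / total if total > 0 else 1.0 / len(phases)
                expected[h1] += share
                expected[h2] += share
        # M-step: expected counts divided by the number of chromosomes.
        new_freqs = {h: expected[h] / n_chrom for h in haplotypes}
        change = max(abs(new_freqs[h] - freqs[h]) for h in haplotypes)
        freqs = new_freqs
        if change < tol:
            break
    return freqs


# Hypothetical toy data: two phase-unambiguous individuals and one double heterozygote.
toy_genotypes = [
    (("A1", "A1"), ("B1", "B1")),
    (("A2", "A2"), ("B2", "B2")),
    (("A1", "A2"), ("B1", "B2")),
]
estimated = em_haplotype_frequencies(toy_genotypes)
for hap, freq in estimated.items():
    print(hap, round(freq, 4))
```

In this toy case the double heterozygote is resolved almost entirely as A1~B1 with A2~B2, because those are the only haplotypes supported by the unambiguous individuals. Multiplying an estimated frequency by the number of chromosomes gives the kind of estimated count reported in the tables.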

Population Genetic Analysis

Pairwise LD estimates are given in Table 6 with D′ and Wn. The LD of allele pairs was always statistically significant with 1,000 permutations. LD plots based on ALD measures for HLA loci are shown in Figure 1. Generally, the associations between HLA loci within HLA classes were stronger than between HLA loci in different classes, except for the case of B & DRB1 loci. Both symmetric and asymmetric LD showed that the strongest genetic linkages were between C & B loci and DRB1 & DQB1 loci.
TABLE 6

Pairwise linkage disequilibrium estimates.

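| Locus pair | D′ | Wn | # permutations | p-value |
|---|---|---|---|---|
| A:C | 0.68522 | 0.56686 | 999 | 0.0000 |
| A:B | 0.76923 | 0.61971 | 999 | 0.0000 |
| A:DRB1 | 0.63441 | 0.51613 | 999 | 0.0000 |
| A:DQB1 | 0.63335 | 0.48973 | 999 | 0.0000 |
| C:B | 0.91995 | 0.84507 | 999 | 0.0000 |
| C:DRB1 | 0.66870 | 0.51054 | 999 | 0.0000 |
| C:DQB1 | 0.65889 | 0.48280 | 999 | 0.0000 |
| B:DRB1 | 0.79814 | 0.58696 | 999 | 0.0000 |
| B:DQB1 | 0.77352 | 0.61127 | 999 | 0.0000 |
| DRB1:DQB1 | 0.93176 | 0.70996 | 999 | 0.0000 |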
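Table 6 reports locus-level summaries, which combine information over all allele pairs at two loci. The building block for a single allele pair is the normalized disequilibrium coefficient D′, sketched below with counts taken from Tables 1 and 3. The function name is hypothetical, the sign convention (negative D′ for negative D) is one common choice, and the sketch is not PyPop's code.

```python
def d_prime(p_ab, p_a, p_b):
    """Normalized LD for one allele pair: D = p_ab - p_a * p_b, scaled by its maximum."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d / d_max if d_max > 0 else 0.0


n_chrom = 202  # 2n for the 101 individuals

# A*29:01:01 (17 copies), B*07:05:01 (14 copies), haplotype estimated count 14.00.
dp_2901_0705 = d_prime(14 / n_chrom, 17 / n_chrom, 14 / n_chrom)
print(round(dp_2901_0705, 3))  # 1.0

# A*11:01:01 (46 copies), B*15:02:01 (24 copies), haplotype estimated count 11.86.
dp_1101_1502 = d_prime(11.86 / n_chrom, 46 / n_chrom, 24 / n_chrom)
print(round(dp_1101_1502, 3))
```

The first pair reaches the maximum because every estimated copy of B∗07:05:01 sits on an A∗29:01:01 haplotype, whereas the second pair gives a value of roughly 0.34, since both alleles are also common on other haplotypes.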
FIGURE 1

LD plot based on asymmetric linkage disequilibrium (ALD) measures for HLA genes.

The PCA plot of eight Asian populations is shown in Figure 2. The percentage of variability represented by the first three principal components was 82.08%. The first, second, and third principal components accounted for 47.29, 20.72, and 14.07% of the variance in allele frequencies between populations, respectively. The first principal component distinguished between the South-East Asian, Han Chinese, and East Asian (Japanese and South Korean) populations. The second principal component separated the Han Chinese, Kinh Vietnamese, and Thai from the Indonesian and Malaysian populations. The third principal component distinguished the Kinh Vietnamese from the Han Chinese and other South-East Asian populations. A homogeneous allele frequency distribution of HLA-A, -B, and -DRB1 was observed between the northern and southern Kinh Vietnamese (Hoa et al., 2008). The Japanese and South Korean populations also presented a similar distribution of HLA alleles.
FIGURE 2

Principal component analysis (PCA) plot of eight populations based on HLA-A, -B, and -DRB1 allele frequencies. PC1, principal component 1; PC2, principal component 2; PC3, principal component 3.


Discussion

In recent years, various HLA-typing methods using different NGS approaches have been performed. NGS-based HLA typing can provide high-resolution, unambiguous, phase-defined HLA alleles, avoiding several limitations compared to traditional sequence-based typing methods (Carapito et al., 2016). Our study showed the distribution of HLA-A, -B, -C, -DRB1, and -DQB1 alleles and haplotypes among the southern Kinh Vietnamese population using high-resolution NGS typing (reported at 3-field resolution, which remains ambiguous in many cases). Highly polymorphic sequences at both HLA class I and class II loci resulted in 28 alleles for HLA-A, 41 alleles for HLA-B, 21 alleles for HLA-C, 26 alleles for HLA-DRB1, and 25 alleles for HLA-DQB1. The most frequent HLA-A alleles found in this study were A∗11:01:01 and A∗24:02:01. The high frequency of HLA-A∗11:01 and A∗24:02:01 is consistent with previous typing results of northern Kinh Vietnamese and other Asian populations, such as the Chinese, Thai, Indonesian, Korean, and Japanese (Lee et al., 2005; Hoa et al., 2008; Yuliwulandari et al., 2009; Shen et al., 2014; Ikeda et al., 2015; Nakkam et al., 2018). Among HLA-C alleles identified in this study, C∗07:02:01 was found to be widely distributed globally, while C∗01:02:01 was common in Asians (Lee et al., 2005; Shen et al., 2014; Ikeda et al., 2015; Nakkam et al., 2018). The predominance of HLA-B∗15 alleles is a major distinguishing characteristic of the Kinh population from the Thai and Chinese groups (Shen et al., 2014; Nakkam et al., 2018). However, this predominance is similar in the Indonesian population (Yuliwulandari et al., 2009). Detailed comparison of B∗15 alleles among the Vietnamese and Indonesians showed similar popularity of B∗15:02, while the second most-frequent B∗15 alleles were B∗15:25:01 and B∗15:13, respectively. 
HLA-B∗07:05:01, the more frequent of the two B∗07 alleles found in Kinh Vietnamese, was the sixth most-frequent HLA-B allele, whereas it is a minor allele in other Asian groups (Whang et al., 2001). At the HLA-DRB1 locus, the most frequent allele was HLA-DRB1∗12:02:01 (22.28%), which is common among South-East Asian populations (Busson et al., 2002; Hoa et al., 2008; Yuliwulandari et al., 2009; Nakkam et al., 2018) but infrequent among Northern East Asian groups, including Japanese and Koreans (Lee et al., 2005; Ikeda et al., 2015). Another similarity observed between the Kinh Vietnamese, Muong Vietnamese, and other South-East Asians is the predominance of HLA-DRB1∗15:02:01 over HLA-DRB1∗15:01:01, in contrast to what was observed among Northern East Asian populations. The first and second-most predominance of HLA-DQB1∗03:01:01 (28.71%) and DQB1∗03:03:02 (12.87%) in Kinh Vietnamese is similar to that among East Asian populations, including Taiwanese, Chinese, Korean, and Japanese (Saito et al., 2000; Lee et al., 2005; Yang and Chen, 2017), while the third-most predominance of HLA-DQB1∗05:02:01 (9.90%) is closer to the characteristics of the Thai population (Romphruk et al., 1999). In Kinh Vietnamese, the predominance of DQB1∗05:01 over DQB1∗05:02 in our data was consistent with data from a previous study (Hoa et al., 2008). However, Muong Vietnamese showed a contrary distribution (48%) of DQB1∗05:02 (Busson et al., 2002). Based on the haplotype calculation, most two-, three-, and five-locus HLA haplotypes with predominant frequencies were consistent with a previous report on northern Kinh Vietnamese (Hoa et al., 2008). Despite being only the sixth most common HLA-B allele, B∗07:05:01 was strongly associated with A∗29:01:01 and led to the common signature haplotypes of the Kinh population, including A∗29:01:01∼B∗07:05:01, A∗29:01:01∼C∗15:05:02∼B∗07:05:01, and A∗29:01:01∼B∗07:05:01∼DRB1∗10:01:01. 
Interestingly, A∗29:01:01∼C∗15:05:02∼B∗07:05:01∼DRB1∗10:01:01∼DQB1∗05:01:01 was the most common five-locus haplotype (4.46%). The predominance of these haplotypes might be a unique feature of the Kinh Vietnamese. The strong association of DRB1∗12:02:01 and DQB1∗03:01:01 in HLA class II found in our study is also well-described in Thai, Indonesian, and surrounding populations (Gao et al., 1992; Romphruk et al., 1999; Mack et al., 2000). The strong associations between all pairs of HLA loci in southern Kinh Vietnamese indicate a low probability of recombination between alleles from these loci; therefore, individuals who carry allele haplotypes in LD are more likely to find a donor with matching haplotypes. The strong LD between class I HLA loci has also been well-described in Asian populations (Shen et al., 2014; Ikeda et al., 2015), while the nearly complete LD of DRB1 and DQB1 loci has been observed in Han Chinese (Trachtenberg et al., 2007). PCA showed a homogeneous HLA-A, -B, and -DRB1 allele distribution of northern and southern Kinh Vietnamese. The allele distribution also demonstrated a closer relationship between Kinh Vietnamese and other South-East Asian groups than with the Han Chinese group. The Japanese were closely grouped with South Koreans, reflecting the similarity in HLA distribution among East Asian populations. Previously, HLA typing of Asian populations was mainly based on SSO-PCR (Lee et al., 2005; Yuliwulandari et al., 2009; Shen et al., 2014; Ikeda et al., 2015; Nakkam et al., 2018). Because only a finite number of probes are designed to recognize the polymorphisms in the core exons, this technique only allows certain allele typing with 2-field resolution. Alleles were then assigned by software based on SSO-PCR patterns. Hence, the number of alleles determined by SSO-PCR is limited. With full-length HLA sequences provided by NGS, HLA-typing software programs align sequence reads to the entire IMGT/HLA Database to find the best-matching alleles. 
NGS-based typing, therefore, can provide diversified HLA assignments. In our study, the number of identified alleles (141 alleles) in 101 subjects was higher than that in the previous study in northern Kinh Vietnamese (115 identified alleles in 170 subjects) (Hoa et al., 2008). Similar results were obtained in the Thai population, in which the numbers of HLA alleles determined by NGS and SSO-PCR were 156 and 144, respectively (Geretz et al., 2018; Nakkam et al., 2018). Recently, it has been shown that both high-resolution HLA typing and haplotyping are important in hematopoietic stem cell transplantation for both unrelated and related donors in reducing post-transplantation adverse outcomes (Agarwal et al., 2017; Buhler et al., 2019); a single high-resolution HLA mismatch may lead to a similar negative effect on outcomes as a low-resolution one (Fuji et al., 2015; Armstrong et al., 2017). Therefore, it has been suggested that high-resolution HLA typing can reduce the likelihood of missing a clinically significant mismatch compared to traditional low-resolution typing, especially in developing countries where high-resolution HLA typing methods are not widely available (Agarwal et al., 2017). With a 3-field resolution, our typing process can distinguish between HLA-A∗24:02:01, HLA-A∗24:02:13, and HLA-A∗24:02:40 and between HLA-C∗07:01:01 and HLA-C∗07:01:02, which are considered high-resolution mismatches. Although traditional SBT can separate these alleles, it is time- and resource-consuming. Our study had several limitations that should be considered in interpreting the results. First of all, the absence of other class II HLA descriptions (HLA-DQA1, -DPA1, and -DPB1) makes the study less informative, especially for population genetic purposes. Second, the study sample size was relatively small. This may increase the risk of missing rare HLA alleles in Kinh Vietnamese and reduce the power of the statistical analysis. 
These limitations will necessitate further studies with comprehensive allele descriptions and larger sample sizes. It is now also well-recognized that HLA molecules are strongly associated with the pathophysiology of adverse drug reactions, including severe cutaneous adverse reaction (SCAR), agranulocytosis, and liver injury. High prevalence of HLA-B∗15:02, B∗58:01, B∗38:02, DRB1∗08:03, and C∗03:02 suggests that the Kinh Vietnamese population is at a high risk of developing carbamazepine-induced SCAR, allopurinol-induced SCAR, methimazole-induced agranulocytosis, and methimazole-induced liver injury, respectively (Hung et al., 2005; Chen et al., 2015; Thao et al., 2018; Li et al., 2019), while the risk of developing dapsone or abacavir-induced hypersensitivity is low due to the low prevalence of HLA-B∗13:01 and B∗57:01 (Mallal et al., 2008; Sousa-Pinto et al., 2015; Tempark et al., 2017). Therefore, HLA information is important to clinicians for treatment modality adoption and to healthcare policymakers for constructing personalized medicine strategies.

Conclusion

To our knowledge, this is the first report of high-resolution HLA-A, -B, -C, -DRB1, and -DQB1 allele and haplotype frequencies in southern Kinh Vietnamese individuals. These data display the homogeneous distribution of HLA between the northern and southern Kinh populations in Vietnam. Although the characteristics of HLA class I and II alleles and haplotypes in the Kinh Vietnamese are similar to those in the Thai, Malaysian, and Indonesian populations, they still retain unique characteristics. Data from this study will be useful in anthropology, immune-mediated diseases, transplantation therapy, and drug hypersensitivity.

Data Availability Statement

Raw data supporting the conclusions of this article are available on NCBI SRA with accession PRJNA609593. The data on HLA allele frequencies and haplotypes presented in this study are available on allelefrequencies.net with accession Vietnam Kinh (n = 101).

Ethics Statement

The studies involving human participants were reviewed and approved by the Ethics Committee of the University of Medicine and Pharmacy at Ho Chi Minh City, Vietnam. The participants provided their written informed consent to participate in this study.

Author Contributions

TM and MD designed the study and wrote the manuscript. MD, LL, and VN performed the experiments. TD, HV, NN, MD, and TM analyzed the data.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References (55 in total)

1.  PyPop update--a software pipeline for large-scale multilocus population genomics.

Authors:  A K Lancaster; R M Single; O D Solberg; M P Nelson; G Thomson
Journal:  Tissue Antigens       Date:  2007-04

2.  Super high resolution for single molecule-sequence-based typing of classical HLA loci at the 8-digit level using next generation sequencers.

Authors:  T Shiina; S Suzuki; Y Ozaki; H Taira; E Kikkawa; A Shigenari; A Oka; T Umemura; S Joshita; O Takahashi; Y Hayashi; M Paumen; Y Katsuyama; S Mitsunaga; M Ota; J K Kulski; H Inoko
Journal:  Tissue Antigens       Date:  2012-08-04

3.  Development of a high-resolution NGS-based HLA-typing and analysis pipeline.

Authors:  Michael Wittig; Jarl A Anmarkrud; Jan C Kässens; Simon Koch; Michael Forster; Eva Ellinghaus; Johannes R Hov; Sascha Sauer; Manfred Schimmler; Malte Ziemann; Siegfried Görg; Frank Jacob; Tom H Karlsen; Andre Franke
Journal:  Nucleic Acids Res       Date:  2015-03-09       Impact factor: 16.971

4.  The sampling theory of selectively neutral alleles.

Authors:  W J Ewens
Journal:  Theor Popul Biol       Date:  1972-03       Impact factor: 1.570

5.  Allele frequencies and haplotypic associations defined by allelic DNA typing at HLA class I and class II loci in the Japanese population.

Authors:  S Saito; S Ota; E Yamada; H Inoko; M Ota
Journal:  Tissue Antigens       Date:  2000-12

6.  Review article: Luminex technology for HLA antibody detection in organ transplantation.

Authors:  Brian D Tait; Fiona Hudson; Linda Cantwell; Gemma Brewin; Rhonda Holdsworth; Greg Bennett; Matthew Jose
Journal:  Nephrology (Carlton)       Date:  2009-04       Impact factor: 2.506

7.  Full-length next-generation sequencing of HLA class I and II genes in a cohort from Thailand.

Authors:  Aviva Geretz; Philip K Ehrenberg; Alain Bouckenooghe; Marcelo A Fernández Viña; Nelson L Michael; Danaya Chansinghakule; Kriengsak Limkittikul; Rasmi Thomas
Journal:  Hum Immunol       Date:  2018-09-19       Impact factor: 2.850

8.  The impact of next-generation sequencing technologies on HLA research (review).

Authors:  Kazuyoshi Hosomichi; Takashi Shiina; Atsushi Tajima; Ituro Inoue
Journal:  J Hum Genet       Date:  2015-08-27       Impact factor: 3.172

9.  Genetic determinants of antithyroid drug-induced agranulocytosis by human leukocyte antigen genotyping and genome-wide association study.

Authors:  Pei-Lung Chen; Shyang-Rong Shih; Pei-Wen Wang; Ying-Chao Lin; Chen-Chung Chu; Jung-Hsin Lin; Szu-Chi Chen; Ching-Chung Chang; Tien-Shang Huang; Keh Sung Tsai; Fen-Yu Tseng; Chih-Yuan Wang; Jin-Ying Lu; Wei-Yih Chiu; Chien-Ching Chang; Yu-Hsuan Chen; Yuan-Tsong Chen; Cathy Shen-Jang Fann; Wei-Shiung Yang; Tien-Chun Chang
Journal:  Nat Commun       Date:  2015-07-07       Impact factor: 14.919

10.  A Bead-based Normalization for Uniform Sequencing depth (BeNUS) protocol for multi-samples sequencing exemplified by HLA-B.

Authors:  Kazuyoshi Hosomichi; Shigeki Mitsunaga; Hideki Nagasaki; Ituro Inoue
Journal:  BMC Genomics       Date:  2014-08-04       Impact factor: 3.969

