Nicoletta Sacchi1, Mauro Castagnetta1, Valeria Miotti2, Lucia Garbarino1, Annamaria Gallina1.
Abstract
HLA genes are highly polymorphic and structurally complex. They are located in the major histocompatibility complex (MHC) on chromosome 6, and the frequencies of their alleles and haplotypes vary widely among human populations. In this paper, we calculated allele and haplotype frequencies from the HLA data of more than 120 000 unrelated Italian bone marrow donors enrolled in the national registry (IBMDR) and typed with a high-resolution (HR) method at the HLA-A, -B, -C and -DRB1 loci. Allele frequencies were obtained by manual counting; haplotype frequencies were calculated using the expectation maximisation (EM) algorithm. The total numbers of observed alleles were 226 for HLA-A, 343 for HLA-B, 201 for HLA-C and 210 for HLA-DRB1, which account for 5.4%, 6.7%, 5.2% and 8.5%, respectively, of the alleles described at each locus (IPD-IMGT/HLA Database Release 3.32, April 2018). The three most frequent Italian haplotypes were HLA-A*01:01~B*08:01~C*07:01~DRB1*03:01 (2.5%), A*02:01~B*18:01~C*07:01~DRB1*11:04 (1.1%) and A*30:01~B*13:02~C*06:02~DRB1*07:01 (1.1%). Moreover, for a large subset of the examined population (>100 000 individuals), the birthplace was available; we therefore grouped the frequency data by the corresponding Italian geographic areas, describing the HLA specificity of the Italian regional populations. National and regional haplotype frequencies were also compared, and we observed remarkable differences in the regional haplotype frequencies, particularly in Sardinia. This study provides a useful tool for designing a more efficient recruitment and selection strategy for unrelated haematopoietic stem cell donors, as well as for population genetics and HLA-disease association studies.
Keywords: frequency; haplotype; regions; unrelated donors
Year: 2019 PMID: 31207125 PMCID: PMC6771744 DOI: 10.1111/tan.13613
Source DB: PubMed Journal: HLA ISSN: 2059-2302 Impact factor: 4.513
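The abstract names the expectation maximisation (EM) algorithm for haplotype frequencies but gives no implementation details. The following is a minimal sketch of the general technique for unphased genotypes, not the authors' software. The function names `phasings` and `em_haplotype_frequencies`, the two-locus toy data and the convergence settings are all assumptions made for illustration; the study itself worked with four loci.

```python
from collections import defaultdict
from itertools import product


def phasings(genotype):
    """List the distinct haplotype pairs compatible with an unphased genotype.

    genotype: tuple of per-locus allele pairs, e.g. (("A1", "A2"), ("B1", "B2")).
    """
    first_locus, rest = genotype[0], genotype[1:]
    pairs = set()
    # Keep the first locus fixed and try both orientations of every other locus.
    for flips in product((False, True), repeat=len(rest)):
        hap1, hap2 = [first_locus[0]], [first_locus[1]]
        for (x, y), flip in zip(rest, flips):
            if flip:
                x, y = y, x
            hap1.append(x)
            hap2.append(y)
        pairs.add(tuple(sorted((tuple(hap1), tuple(hap2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=100, tol=1e-10):
    """Estimate haplotype frequencies by EM (illustrative sketch)."""
    options = [phasings(g) for g in genotypes]
    haplotypes = sorted({h for opts in options for pair in opts for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(n_iter):
        counts = defaultdict(float)
        for opts in options:
            # E-step: weight each phasing by its probability under current freqs
            # (2*p*q for two different haplotypes, p*p for two identical ones).
            weights = [(1 if h1 == h2 else 2) * freq[h1] * freq[h2]
                       for h1, h2 in opts]
            total = sum(weights)
            for (h1, h2), w in zip(opts, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype count divided by the number of chromosomes.
        new_freq = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
        delta = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if delta < tol:
            break
    return freq


# HYPOTHETICAL toy data: allele names are placeholders, not real HLA typings.
toy_genotypes = (
    [(("A1", "A1"), ("B1", "B1"))] * 3
    + [(("A2", "A2"), ("B2", "B2"))]
    + [(("A1", "A2"), ("B1", "B2"))]  # ambiguous double heterozygote
)
estimated = em_haplotype_frequencies(toy_genotypes)
```

In the toy data only the last individual is ambiguous; the homozygous individuals make the A1~B1 / A2~B2 phasing far more likely, so the estimate concentrates on those two haplotypes.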
Characteristics of the data sets
| Data set | Sample size | Males (%) | Age 18‐25 | Age 26‐35 | Age 36‐45 | Age 46‐55 | Median age (25th‐75th percentile) |
|---|---|---|---|---|---|---|---|
| Data set 1 | 120 926 | 45.6 | 44 466 (36.8%) | 49 582 (41.0%) | 23 760 (19.6%) | 3118 (2.6%) | 28 (23‐35) |
| Data set 2 | 104 135 | 45.9 | 37 459 (36.0%) | 43 254 (41.5%) | 20 761 (19.9%) | 2661 (2.6%) | 28 (24‐35) |
| Data set 3 | 55 538 | 46.8 | 18 496 (33.3%) | 22 441 (40.4%) | 12 167 (21.9%) | 2434 (4.4%) | 29 (24‐36) |
The first 20 frequency‐ranked HLA‐A, ‐B, ‐C and ‐DRB1 alleles
| Rank | HLA‐A allele | Freq. (%) | HLA‐B allele | Freq. (%) | HLA‐C allele | Freq. (%) | HLA‐DRB1 allele | Freq. (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | 02:01g | 22.820 | 51:01g | 9.798 | 04:01g | 17.526 | 07:01g | 12.525 |
| 2 | 24:02g | 12.286 | 18:01g | 9.523 | 07:01g | 17.108 | 11:01g | 11.632 |
| 3 | 01:01g | 11.528 | 35:01g | 8.004 | 06:02g | 9.921 | 11:04g | 10.067 |
| 4 | 03:01g | 10.628 | 08:01g | 5.760 | 12:03g | 9.028 | 03:01g | 9.472 |
| 5 | 11:01g | 5.973 | 07:02g | 5.239 | 07:02g | 6.582 | 01:01g | 5.987 |
| 6 | 32:01g | 5.217 | 44:02g | 4.402 | 05:01g | 5.773 | 15:01g | 5.874 |
| 7 | 26:01g | 4.585 | 44:03g | 3.828 | 02:02g | 4.600 | 14:01g | 5.389 |
| 8 | 68:01g | 2.939 | 49:01g | 3.548 | 15:02g | 3.845 | 13:01g | 5.135 |
| 9 | 30:01g | 2.697 | 38:01g | 3.451 | 03:03g | 3.604 | 16:01g | 4.957 |
| 10 | 23:01g | 2.661 | 13:02g | 3.419 | 08:02g | 3.413 | 13:02g | 4.937 |
| 11 | 29:02g | 2.556 | 35:03g | 3.372 | 01:02g | 3.381 | 01:02g | 2.447 |
| 12 | 31:01g | 2.336 | 15:01g | 3.116 | 16:01g | 2.533 | 08:01g | 2.017 |
| 13 | 30:02g | 1.873 | 14:02g | 2.931 | 14:02g | 2.315 | 10:01g | 1.765 |
| 14 | 02:05g | 1.839 | 35:02g | 2.749 | 03:04g | 1.932 | 04:03g | 1.642 |
| 15 | 33:01g | 1.745 | 57:01g | 2.559 | 12:02g | 1.637 | 04:01g | 1.636 |
| 16 | 25:01g | 1.645 | 58:01g | 1.976 | 17:01g | 1.488 | 11:03g | 1.572 |
| 17 | 68:02g | 0.802 | 50:01g | 1.954 | 07:04g | 1.253 | 15:02g | 1.396 |
| 18 | 03:02g | 0.792 | 55:01g | 1.936 | 16:02g | 1.114 | 04:02g | 1.381 |
| 19 | 29:01g | 0.758 | 39:01g | 1.925 | 15:05g | 0.915 | 04:05g | 1.379 |
| 20 | 33:03g | 0.623 | 52:01g | 1.628 | 03:02g | 0.689 | 12:01g | 1.335 |
Figure 1. Cumulative HLA‐A, ‐B, ‐C, ‐DRB1 allele frequencies
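A cumulative frequency curve of the kind named in the Figure 1 caption can be sketched from the ranked allele table. The example below uses the 20 HLA‐A frequencies listed in that table; the variable and function names are illustrative assumptions, and the figure in the paper may be built from the full allele list rather than the top 20.

```python
from itertools import accumulate

# HLA-A frequencies (%) for ranks 1-20, copied from the ranked allele table.
hla_a_freq_percent = [
    22.820, 12.286, 11.528, 10.628, 5.973, 5.217, 4.585, 2.939, 2.697, 2.661,
    2.556, 2.336, 1.873, 1.839, 1.745, 1.645, 0.802, 0.792, 0.758, 0.623,
]

# Running total: entry i is the share covered by the i+1 most frequent alleles.
cumulative_percent = list(accumulate(hla_a_freq_percent))


def alleles_needed(cumulative, threshold_percent):
    """Return how many top-ranked alleles are needed to reach the threshold.

    Returns None if the listed alleles never reach it.
    """
    for rank, covered in enumerate(cumulative, start=1):
        if covered >= threshold_percent:
            return rank
    return None


top4_coverage = cumulative_percent[3]
top20_coverage = cumulative_percent[-1]
```

With these values the four most frequent HLA‐A alleles together cover roughly 57% of the observed alleles, and the 20 listed alleles cover roughly 96%.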
Number of common (C) and well‐documented (WD) alleles (CWD) in 55 538 Italian individuals
| Locus | C | WD | no CWD | CWD | Total number of HLA alleles | % CWD |
|---|---|---|---|---|---|---|
| A | 30 | 38 | 84 | 68 | 152 | 44.7 |
| B | 50 | 72 | 126 | 122 | 248 | 49.2 |
| C | 24 | 33 | 77 | 57 | 134 | 42.5 |
| DRB1 | 35 | 36 | 87 | 71 | 158 | 44.9 |
| Total | 139 | 179 | 374 | 318 | 692 | 46.0 |
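The columns of this table are linked by simple arithmetic: CWD is the sum of C and WD, the non-CWD count is the remainder, and % CWD is CWD divided by the total number of alleles. A short sketch that recomputes them from the C, WD and total columns is given here; the dictionary layout and function name are assumptions for illustration.

```python
# C, WD and total allele counts per locus, copied from the table above.
cwd_counts = {
    "A": {"C": 30, "WD": 38, "total": 152},
    "B": {"C": 50, "WD": 72, "total": 248},
    "C": {"C": 24, "WD": 33, "total": 134},
    "DRB1": {"C": 35, "WD": 36, "total": 158},
}


def summarise(counts):
    """Derive CWD, non-CWD and % CWD (one decimal place) from C, WD and total."""
    cwd = counts["C"] + counts["WD"]
    return {
        "CWD": cwd,
        "no CWD": counts["total"] - cwd,
        "% CWD": round(100 * cwd / counts["total"], 1),
    }


per_locus = {locus: summarise(c) for locus, c in cwd_counts.items()}

# Pool all four loci to reproduce the "Total" row.
pooled = summarise({
    key: sum(c[key] for c in cwd_counts.values())
    for key in ("C", "WD", "total")
})
```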
Comparison between the numbers of Italian common (C) and well‐documented (WD) alleles (CWD), ASHI criteria vs. EFI catalogue
| Locus | Italian CWD according to ASHI | Total | EFI catalogue: C | EFI catalogue: WD | EFI catalogue: No CWD |
|---|---|---|---|---|---|
| Overall | C | 139 | 132 | 7 | 0 |
| Overall | WD | 179 | 35 | 75 | 69 |
| Overall | No CWD | 374 | 5 | 125 | 244 |
| Overall | Total | 692 | 172 | 207 | 313 |
| A | C | 30 | 28 | 2 | 0 |
| A | WD | 38 | 5 | 17 | 16 |
| A | No CWD | 84 | 2 | 34 | 48 |
| A | Total | 152 | 35 | 53 | 64 |
| B | C | 50 | 46 | 4 | 0 |
| B | WD | 72 | 15 | 32 | 25 |
| B | No CWD | 126 | 1 | 42 | 83 |
| B | Total | 248 | 62 | 78 | 108 |
| C | C | 24 | 23 | 1 | 0 |
| C | WD | 33 | 7 | 13 | 13 |
| C | No CWD | 77 | 2 | 25 | 50 |
| C | Total | 134 | 32 | 39 | 63 |
| DRB1 | C | 35 | 35 | 0 | 0 |
| DRB1 | WD | 36 | 8 | 13 | 15 |
| DRB1 | No CWD | 87 | 0 | 24 | 63 |
| DRB1 | Total | 158 | 43 | 37 | 78 |
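In each block of this comparison, the cells where the ASHI and EFI categories coincide are fixed by the margins: each equals its row total minus the other two cells in that row, and the result can be checked against the column totals. The sketch below does this for the "Overall" block and derives the share of alleles classified identically by both schemes. The names used are illustrative assumptions, and the agreement figure is a derived quantity, not one quoted in the text shown here.

```python
CATEGORIES = ("C", "WD", "No CWD")

# "Overall" block: row totals (ASHI), column totals (EFI) and the cells
# where the two classifications disagree, copied from the table.
row_totals = {"C": 139, "WD": 179, "No CWD": 374}
column_totals = {"C": 172, "WD": 207, "No CWD": 313}
off_diagonal = {
    "C": {"WD": 7, "No CWD": 0},
    "WD": {"C": 35, "No CWD": 69},
    "No CWD": {"C": 5, "WD": 125},
}


def complete_table(row_totals, off_diagonal):
    """Fill each matching-category cell as row total minus the other cells."""
    table = {}
    for row in CATEGORIES:
        cells = dict(off_diagonal[row])
        cells[row] = row_totals[row] - sum(off_diagonal[row].values())
        table[row] = cells
    return table


table = complete_table(row_totals, off_diagonal)

# Consistency check: every column must add up to its stated EFI total.
columns_consistent = all(
    sum(table[row][col] for row in CATEGORIES) == column_totals[col]
    for col in CATEGORIES
)

# Fraction of alleles placed in the same category by both classifications.
agreement = sum(table[c][c] for c in CATEGORIES) / sum(row_totals.values())
```

The same function can be applied to the per-locus blocks by swapping in their row totals and off-diagonal cells.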
The 50 most common haplotypes observed in the Italian population
| Rank | HLA‐A | HLA‐B | HLA‐C | HLA‐DRB1 | Frequency |
|---|---|---|---|---|---|
| 1 | 01:01g | 08:01g | 07:01g | 03:01g | 0.025357 |
| 2 | 02:01g | 18:01g | 07:01g | 11:04g | 0.011435 |
| 3 | 30:01g | 13:02g | 06:02g | 07:01g | 0.010879 |
| 4 | 29:02g | 44:03g | 16:01g | 07:01g | 0.010829 |
| 5 | 03:01g | 07:02g | 07:02g | 15:01g | 0.010167 |
| 6 | 33:01g | 14:02g | 08:02g | 01:02g | 0.009459 |
| 7 | 24:02g | 35:02g | 04:01g | 11:04g | 0.009276 |
| 8 | 30:02g | 18:01g | 05:01g | 03:01g | 0.008925 |
| 9 | 03:01g | 35:01g | 04:01g | 01:01g | 0.007673 |
| 10 | 01:01g | 57:01g | 06:02g | 07:01g | 0.006078 |
| 11 | 11:01g | 35:01g | 04:01g | 01:01g | 0.005041 |
| 12 | 23:01g | 44:03g | 04:01g | 07:01g | 0.004771 |
| 13 | 02:01g | 13:02g | 06:02g | 07:01g | 0.004577 |
| 14 | 02:01g | 35:01g | 04:01g | 14:01g | 0.004511 |
| 15 | 02:01g | 07:02g | 07:02g | 15:01g | 0.004326 |
| 16 | 11:01g | 35:01g | 04:01g | 14:01g | 0.004191 |
| 17 | 24:02g | 18:01g | 12:03g | 11:04g | 0.004070 |
| 18 | 02:01g | 18:01g | 05:01g | 03:01g | 0.003987 |
| 19 | 02:05g | 50:01g | 06:02g | 07:01g | 0.003841 |
| 20 | 02:01g | 08:01g | 07:01g | 03:01g | 0.003554 |
| 21 | 24:02g | 18:01g | 07:01g | 11:04g | 0.003441 |
| 22 | 23:01g | 49:01g | 07:01g | 11:01g | 0.003120 |
| 23 | 26:01g | 38:01g | 12:03g | 13:01g | 0.003038 |
| 24 | 02:05g | 58:01g | 07:01g | 16:01g | 0.002856 |
| 25 | 24:02g | 07:02g | 07:02g | 15:01g | 0.002715 |
| 26 | 02:01g | 51:01g | 15:02g | 11:01g | 0.002711 |
| 27 | 02:01g | 57:01g | 06:02g | 07:01g | 0.002604 |
| 28 | 01:01g | 52:01g | 12:02g | 15:02g | 0.002543 |
| 29 | 01:01g | 15:17g | 07:01g | 13:02g | 0.002512 |
| 30 | 02:01g | 44:02g | 05:01g | 04:01g | 0.002458 |
| 31 | 02:01g | 44:02g | 05:01g | 11:01g | 0.002458 |
| 32 | 25:01g | 18:01g | 12:03g | 15:01g | 0.002451 |
| 33 | 01:01g | 35:02g | 04:01g | 11:04g | 0.002418 |
| 34 | 02:01g | 51:01g | 01:02g | 11:01g | 0.002297 |
| 35 | 11:01g | 52:01g | 12:02g | 15:02g | 0.002279 |
| 36 | 02:01g | 18:01g | 07:01g | 11:01g | 0.002149 |
| 37 | 02:01g | 44:02g | 05:01g | 13:01g | 0.002074 |
| 38 | 24:02g | 13:02g | 06:02g | 07:01g | 0.002074 |
| 39 | 01:01g | 37:01g | 06:02g | 10:01g | 0.002046 |
| 40 | 02:01g | 39:01g | 12:03g | 16:01g | 0.001893 |
| 41 | 24:02g | 08:01g | 07:01g | 03:01g | 0.001840 |
| 42 | 11:01g | 35:01g | 04:01g | 11:01g | 0.001839 |
| 43 | 02:01g | 38:01g | 12:03g | 13:01g | 0.001800 |
| 44 | 68:01g | 44:02g | 07:04g | 11:01g | 0.001793 |
| 45 | 03:01g | 07:02g | 07:02g | 11:01g | 0.001788 |
| 46 | 02:01g | 51:01g | 14:02g | 11:01g | 0.001776 |
| 47 | 02:01g | 51:01g | 02:02g | 11:01g | 0.001755 |
| 48 | 02:01g | 18:01g | 12:03g | 11:04g | 0.001706 |
| 49 | 01:01g | 08:01g | 07:01g | 11:01g | 0.001701 |
| 50 | 02:01g | 35:01g | 04:01g | 01:01g | 0.001643 |
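The table lists each haplotype as four allele columns plus a frequency expressed as a fraction, whereas the abstract writes haplotypes in the `A*..~B*..~C*..~DRB1*..` notation with percentages. A small sketch of that conversion follows; the function name is an assumption, and dropping the trailing `g` simply mirrors how the abstract prints the alleles.

```python
LOCI = ("A", "B", "C", "DRB1")


def format_haplotype(alleles, frequency, keep_g=False):
    """Render a table row as 'A*..~B*..~C*..~DRB1*.. (x.x%)'.

    alleles: four allele names in the order A, B, C, DRB1, e.g. "01:01g".
    frequency: haplotype frequency as a fraction (0-1), as in the table.
    """
    parts = []
    for locus, allele in zip(LOCI, alleles):
        if not keep_g and allele.endswith("g"):
            allele = allele[:-1]  # drop the trailing 'g' used in the tables
        parts.append(f"{locus}*{allele}")
    return f"{'~'.join(parts)} ({frequency * 100:.1f}%)"


# Ranks 1-3 copied from the haplotype table.
top_three = [
    (("01:01g", "08:01g", "07:01g", "03:01g"), 0.025357),
    (("02:01g", "18:01g", "07:01g", "11:04g"), 0.011435),
    (("30:01g", "13:02g", "06:02g", "07:01g"), 0.010879),
]
labels = [format_haplotype(alleles, freq) for alleles, freq in top_three]
```

The three labels produced this way agree with the haplotypes and rounded percentages quoted in the abstract.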
Figure 2. Sum of squares (SS) frequency differences between each region and the national data

Figure 3. Comparison between national and regional haplotype frequencies (per thousand)
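The exact definition of the SS measure in Figure 2 is not given in this excerpt. One plausible reading, assumed here, is the sum over haplotypes of the squared difference between the regional and the national frequency. In the sketch below the national values are the three top-ranked haplotype frequencies from the table, while the regional values are invented purely for illustration and do not describe any real Italian region. The helper `per_thousand` shows the unit used in the Figure 3 caption.

```python
# National frequencies for the three top-ranked haplotypes (from the table).
national = {
    "A*01:01g~B*08:01g~C*07:01g~DRB1*03:01g": 0.025357,
    "A*02:01g~B*18:01g~C*07:01g~DRB1*11:04g": 0.011435,
    "A*30:01g~B*13:02g~C*06:02g~DRB1*07:01g": 0.010879,
}

# HYPOTHETICAL regional frequencies, made up for this example only.
hypothetical_region = {
    "A*01:01g~B*08:01g~C*07:01g~DRB1*03:01g": 0.020000,
    "A*02:01g~B*18:01g~C*07:01g~DRB1*11:04g": 0.015000,
    # third haplotype deliberately absent: treated as frequency 0
}


def sum_of_squares(regional, national):
    """Assumed SS: sum of squared regional-minus-national differences.

    Haplotypes missing from one of the two inputs count as frequency 0.
    """
    haplotypes = set(regional) | set(national)
    return sum(
        (regional.get(h, 0.0) - national.get(h, 0.0)) ** 2 for h in haplotypes
    )


def per_thousand(frequency):
    """Convert a fractional frequency to a per-thousand value."""
    return frequency * 1000


ss_example = sum_of_squares(hypothetical_region, national)
```

Under this definition a region whose haplotype frequencies match the national ones exactly has SS equal to zero, and larger values indicate a more distinct regional profile.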