Ray Sajulga1, Yung-Tsi Bolon1, Martin J Maiers2, Effie W Petersdorf3,4.
Abstract
Sequence variation in the HLA-B gene is critically linked to differential immune responses. A dimorphism at -21 of HLA-B exon 1 gives rise to leader peptides that are markers for risk of acute graft-versus-host disease, relapse, and mortality after unrelated donor and cord blood transplantation. To optimize the selection of stem cell transplant sources based on the HLA-B leader, an HLA-B Leader Assessment Tool (BLEAT) was developed to automate the assignment of leader genotypes, define HLA-B leader match statuses, and rank order candidate stem cell sources according to clinical risk. The base cohort consisted of 9 417 614 registered donors from the Be The Match Registry with HLA-B typing. Among these donors, the performance of BLEAT was assessed in 1 098 358 donors with sequence data for HLA-B exon 1 (2 196 716 haplotypes). The accuracy of leader assignment was then assessed in a second cohort of 1259 patients and their unrelated transplant donors. We furthermore established the frequencies of HLA-B leader genotype (MM, MT, TT) representations in broad racial categories in the 9.42 million donors. BLEAT has direct applications for the selection of optimal stem cell sources for transplantation and broad utility in basic and clinical research in pharmacogenomics, vaccine development, and cancer and infectious disease studies of human populations.
Year: 2022 PMID: 34529780 PMCID: PMC8753210 DOI: 10.1182/bloodadvances.2021004561
Source DB: PubMed Journal: Blood Adv ISSN: 2473-9529
HLA-B leader genotype and allele frequencies in 5 US continental races in the base cohort
| Donor race category | No. of donors | T leader allele frequency (%) | M leader allele frequency (%) | TT genotype frequency (%) | MT genotype frequency (%) | MM genotype frequency (%) |
|---|---|---|---|---|---|---|
| White | 5 224 777 | 67.8 | 32.2 | 46.2 | 43.3 | 10.5 |
| Hispanic or Latino | 1 289 395 | 72.7 | 27.3 | 53.0 | 39.3 | 7.7 |
| Asian or Pacific Islander | 951 723 | 85.6 | 14.4 | 73.6 | 24.0 | 2.4 |
| Black or African American | 837 531 | 75.4 | 24.6 | 56.9 | 36.9 | 6.1 |
| American Indian or Alaskan Native | 79 858 | 71.1 | 28.9 | 51.0 | 40.1 | 8.8 |
Allele frequencies across populations vary for T (67.8%-85.6%) and M (14.4%-32.2%), and genotype frequencies vary for TT (46.2%-73.6%), MT (24.0%-43.3%), and MM (2.4%-10.5%). Genotype frequencies differed significantly across races by Pearson's χ² test of independence: χ²(8, N = 9 417 614) = 273 178, P < 2.2 × 10⁻¹⁶. The base cohort total is 9 417 614; the table omits 1 034 330 donors whose race was unknown, multirace, other, or declined.
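The allele frequencies in the table follow from the genotype frequencies by allele counting: each TT donor carries two T alleles and each MT donor carries one, so f(T) = f(TT) + f(MT)/2. The following is a minimal Python sketch of that arithmetic using the rounded percentages from the table. The function name is illustrative and is not part of BLEAT, and because the inputs are rounded to one decimal place, the results can differ from the published values by about 0.1 percentage point.

```python
def allele_frequencies(tt, mt, mm):
    """Return (T, M) allele frequencies in percent from genotype percentages."""
    total = tt + mt + mm  # may differ slightly from 100 because of rounding
    t = (tt + mt / 2) / total * 100
    m = (mm + mt / 2) / total * 100
    return t, m


# Genotype percentages (TT, MT, MM) copied from the table above.
genotypes = {
    "White": (46.2, 43.3, 10.5),
    "Hispanic or Latino": (53.0, 39.3, 7.7),
    "Asian or Pacific Islander": (73.6, 24.0, 2.4),
    "Black or African American": (56.9, 36.9, 6.1),
    "American Indian or Alaskan Native": (51.0, 40.1, 8.8),
}

for race, (tt, mt, mm) in genotypes.items():
    t, m = allele_frequencies(tt, mt, mm)
    print(f"{race}: T = {t:.1f}%, M = {m:.1f}%")
```

Normalizing by the row total keeps T and M summing to 100% even when the rounded genotype percentages sum to 99.9.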
Unique exon 1 sequences in cohort 1
| Deduced leader peptide | Nonamer codon P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | Total (% of total cohort) |
|---|---|---|---|---|---|---|---|---|---|---|
| V | GTC | | GCG | CCC | CGA | ACC | GTC | CTC | CTG | 683 153 (31.1) |
| V | — | | — | — | — | — | — | — | — | 670 904 (30.5) |
| V | — | | — | — | — | — | C- | — | — | 542 311 (24.7) |
| V | — | | -A | — | — | — | — | — | — | 242 366 (11.0) |
| V | — | | -A | — | — | — | C- | — | — | 57 747 (2.6) |
| V | — | | -A- | — | — | — | C- | — | — | 107 |
| V | — | | — | — | A- | — | — | — | — | 25 |
| V | — | | -A | — | — | — | — | — | — | 21 |
| V | — | | — | — | — | — | — | — | — | 16 |
| V | — | | -T | — | — | — | C- | — | — | 10 |
| V | — | | -A | — | — | — | C- | T- | — | 7 |
| V | — | | — | T- | — | — | C- | — | — | 6 |
| V | — | | — | — | — | — | C- | — | — | 5 |
| V | — | | — | T- | — | — | — | — | — | 4 |
| V | — | | -A | — | — | — | C- | — | — | 3 |
| V | — | | — | — | — | — | A- | — | — | 3 |
| V | — | | — | — | — | -A- | C-– | — | — | 3 |
| V | — | | — | — | — | — | — | — | T- | 3 |
| V | — | | -A | — | — | — | — | — | -A | 3 |
| V | — | | — | — | -A- | — | — | — | — | 3 |
| V | — | | — | A- | — | — | C- | — | — | 2 |
| V | — | | — | -T- | — | — | — | — | — | 1 |
| V | — | | — | — | — | — | C- | G- | — | 1 |
| I | A- | | — | — | — | — | — | — | — | 1 |
| V | — | | -A- | — | — | — | — | — | — | 1 |
| V | — | | — | — | T- | — | C- | — | — | 1 |
| V | — | | — | — | — | — | C- | — | -A | 1 |
| V | — | | — | — | — | — | T– | — | — | 1 |
| V | — | | — | A- | — | — | — | — | — | 1 |
| V | — | | — | — | G- | — | — | — | — | 1 |
| V | — | -C- | — | — | — | — | — | — | -A- | 1 |
| V | — | — | — | — | — | -G- | — | — | — | 1 |
| V | — | G- | — | — | — | — | — | — | — | 1 |
| V | — | — | -T- | — | — | — | — | — | — | 1 |
| V | — | — | — | — | — | — | — | — | G- | 1 |
| Total | | | | | | | | | | 2 196 716 |
The 1 098 358 donors in cohort 1 contain 2 196 716 observed exon 1 nucleotide sequences (each of which encodes a deduced peptide sequence). Each row details a unique nucleotide sequence. Nucleotide sequence rows are sorted by decreasing observations. Nucleotide sequences are aligned with the most frequent sequence; hyphens indicate consensus nucleotides. The asterisk within VTAP*TLLL indicates a deduced stop codon. The P2 sequence is indicated in bold and is deduced to encode threonine (T) (1 513 499 observations), methionine (M) (683 213 observations), arginine (R) (3 observations), or valine (V) (1 observation). Percentages are provided for the top 5 peptides, which represent 99.99% of all observations. Race information for each peptide is provided in Table 4. Deduced leader peptides with multiple unique nucleotide sequences are annotated with footnotes.
These deduced leader peptides have 4 unique exon 1 sequences each in cohort 1.
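Deducing a leader peptide from a nonamer nucleotide sequence is a codon-by-codon translation, with the leader type read from the second residue (P2). The sketch below uses the standard genetic code and the consensus codons listed in the first table row (GTC at P1, then GCG CCC CGA ACC GTC CTC CTG at P3 through P9). The P2 codons are assumptions made for illustration: ATG is used for methionine (its only codon in the standard code) and ACG for threonine. The function names are hypothetical and do not describe BLEAT's implementation.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def leader_from_nonamer(dna):
    """Return (peptide, P2 residue) for a 27-nucleotide nonamer sequence."""
    peptide = translate(dna.upper())
    if len(peptide) != 9:
        raise ValueError("expected 9 codons (27 nucleotides)")
    return peptide, peptide[1]  # P2 is the second residue


# P2 codons (ATG, ACG) are illustrative assumptions; other codons follow the table.
print(leader_from_nonamer("GTCATGGCGCCCCGAACCGTCCTCCTG"))
print(leader_from_nonamer("GTCACGGCGCCCCGAACCCTCCTCCTG"))
```

A stop codon inside the nonamer appears as an asterisk in the returned peptide, matching the notation used for VTAP*TLLL.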
Summary of HLA-B allele families and their linked leader types in cohort 1
| Allele family (leader) | Existing reference alleles (IMGT): minor allele(s) | Cohort 1: major alleles, count | Cohort 1: minor allele(s) | Cohort 1: minor allele(s), count (broad race) | Cohort 1: total |
|---|---|---|---|---|---|
| | | 251 053 | | 1 (White) | 251 055 |
| | | | | 1 (White) | |
| | — | 199 301 | | 1 (White) | 199 302 |
| | | 54 456 | | 1 (Asian or Pacific Islander) | 54 457 |
| | — | 84 890 | — | — | 84 890 |
| | | 190 429 | | 2 (Asian or Pacific Islander, multirace) | 190 431 |
| | — | 95 598 | | 2 (White) | 95 600 |
| | — | 78 897 | — | — | 78 897 |
| | — | 235 193 | | 33 (31 White, 2 multirace) | 235 226 |
| | — | 29 684 | — | — | 29 684 |
| | — | 45 774 | — | — | 45 774 |
| | — | 69 503 | — | — | 69 504 |
| | | 142 805 | | 2 (White, American Indian or Alaskan Native) | 142 807 |
| | — | 17 724 | — | — | 17 724 |
| | — | 11 772 | — | — | 11 772 |
| | — | 215 355 | — | — | 215 355 |
| | — | 17 870 | — | — | 17 870 |
| | — | 9781 | — | — | 9781 |
| | — | 4074 | — | — | 4074 |
| | — | 15 407 | — | — | 15 407 |
| | — | 30 234 | — | — | 30 234 |
| | — | 21 137 | — | — | 21 137 |
| | — | 126 774 | — | — | 126 774 |
| | | 37 355 | | 1 (White) | 37 356 |
| | — | 29 726 | — | — | 29 726 |
| | — | 4116 | — | — | 4116 |
| | | 37 710 | | 3 (2 Asian or Pacific Islander, 1 White) | 37 713 |
| | | 13 845 | — | — | 13 845 |
| | — | 78 082 | — | — | 78 082 |
| | | 39 521 | | 1 (multirace) | 39 522 |
| | — | 646 | — | — | 646 |
| | — | 698 | — | — | 698 |
| | — | 1160 | — | — | 1160 |
| | — | 1986 | — | — | 1986 |
| | — | 3613 | — | — | 3613 |
| | — | 494 | — | — | 494 |
| | — | 4 | — | — | 4 |
| Total | | 2 196 667 | | 49 | 2 196 716 |
Reference HLA-B alleles and leader alleles were obtained from IPD-IMGT/HLA version 3.42.0.[35] Each first-field HLA-B family is characterized by a major leader allele. A total of 12 minor alleles are found among 8 first-field families. Genomic sequence-based observations of cohort 1 confirmed 8 of these 12 minor alleles and additionally detected evidence for another 3 HLA-B alleles with minor leader alleles at exon 1. Minor alleles represented 0.0022% of the leader alleles in cohort 1 (49 of 2 196 716).
There are 5 possible broad race categorizations (Black or African American, Asian or Pacific Islander, White, Hispanic or Latin American, and American Indian or Alaskan Native) and 4 miscellaneous categories (multirace, unknown, other, and declined). Multirace applies to those who have more than one broad race listed. Additional details are listed in the Materials and Methods.
Four rare minor alleles encoded novel polymorphisms outside of exons 2 through 4 (genomic positions 201-471, 716-992, and 1566-1842, respectively) compared with reference sequences first described by IPD-IMGT/HLA version 3.42.0.[1] Three of these alleles encoded minor leader alleles in families with no known prior minor leader alleles. The closest-matching allele was used as a placeholder. Note that P2 of the leader allele spans genomic positions 10 through 12. The novel polymorphisms for HLA-B*07:390 are −18A>G, 15G>A, 2892T>C, and 2904G>C. HLA-B*08:207 has 11T>C. HLA-B*18:01:01 has −18A>G and 11C>T (GenBank accession MH173353 for the unique sequence). HLA-B*35:01:01 has 18A>G and 11C>T (GenBank accessions MG756798, MH973951, and MG769755 for unique sequences).
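Because P2 spans positions 10 through 12, a substitution at position 11 changes the middle base of the P2 codon, which is how 11C>T or 11T>C can switch a family's usual leader type. The sketch below applies such a substitution to a P2 codon and reads off the resulting leader residue. The starting codons (ACG for threonine, ATG for methionine) and all function names are assumptions made for illustration; only the position range and the variant notation come from the note above.

```python
P2_START = 10  # first position of the P2 codon, per the note above (positions 10-12)

# Standard-code codons for the two common leader residues.
LEADER_BY_CODON = {"ATG": "M", "ACA": "T", "ACC": "T", "ACG": "T", "ACT": "T"}


def apply_substitution(p2_codon, position, ref, alt):
    """Apply a single-nucleotide substitution such as 11C>T to the P2 codon."""
    index = position - P2_START
    if not 0 <= index < 3:
        raise ValueError("position lies outside the P2 codon (10-12)")
    if p2_codon[index] != ref:
        raise ValueError("reference base does not match the codon")
    return p2_codon[:index] + alt + p2_codon[index + 1:]


def leader_residue(p2_codon):
    """Return 'M' or 'T' for a P2 codon, or '?' for any other codon."""
    return LEADER_BY_CODON.get(p2_codon, "?")


# 11C>T on an assumed threonine codon (ACG) yields the methionine codon.
mutated = apply_substitution("ACG", 11, "C", "T")
print(mutated, leader_residue(mutated))

# 11T>C on the methionine codon (ATG) yields a threonine codon.
mutated = apply_substitution("ATG", 11, "T", "C")
print(mutated, leader_residue(mutated))
```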
Broad race categories for leader peptides observed in cohort 1
| Deduced leader peptide | White | Hispanic or Latino | Asian or Pacific Islander | Black or African American | American Indian or Alaskan Native | Multirace | Unknown | Total |
|---|---|---|---|---|---|---|---|---|
| | Percentage by leader peptide’s total, horizontally (%); | | | | | | | |
| V | 64.5; | 15.9; | 3.6; | 3.8; | 0.3; | 11.1; | 0.9; | 683 153 (31.1) |
| V | 47.8; | 19.6; | 11.4; | 7.2; | 0.4; | 12.9; 33.5 | 0.6; | 670 904 (30.5) |
| V | 63.2; | 15.2; | 7.0; | 2.3; | 0.4; | 11.1; | 0.9; | 542 311 (24.7) |
| V | 61.9; | 12.7; | 6.7; | 5.8; | 0.3; | 11.8; | 0.9; | 242 366 (11.0) |
| V | 59.5; | 8.8; | 16.9; | 1.6; | 0.3; | 12.1; | 0.7; | 57 747 (2.6) |
| V | 77.6 | 10.3 | — | 0.9 | — | 11.2 | — | 107 |
| V | 92.0 | 4.0 | — | — | — | 4.0 | — | 25 |
| V | 9.5 | — | 71.4 | — | 4.8 | 14.3 | — | 21 |
| V | 6.3 | 31.3 | — | 56.3 | — | 6.3 | — | 16 |
| V | 10.0 | 50.0 | 40.0 | — | — | — | — | 10 |
| V | 100.0 | — | — | — | — | — | — | 7 |
| V | — | 50.0 | — | 16.7 | — | 33.3 | — | 6 |
| V | 80.0 | — | 20.0 | — | — | — | — | 5 |
| V | 75.0 | — | — | — | — | 25.0 | — | 4 |
| V | 33.3 | — | 66.7 | — | — | — | — | 3 |
| V | 100.0 | — | — | — | — | — | — | 3 |
| V | 100.0 | — | — | — | — | — | — | 3 |
| V | 100.0 | — | — | — | — | — | — | 3 |
| V | — | — | 66.7 | — | — | 33.3 | — | 3 |
| V | 33.3 | — | — | — | — | 66.7 | — | 3 |
| V | 100.0 | — | — | — | — | — | — | 2 |
| V | — | 100.0 | — | — | — | — | — | 1 |
| V | 100.0 | — | — | — | — | — | — | 1 |
| I | — | — | — | — | — | — | 100.0 | 1 |
| V | 100.0 | — | — | — | — | — | — | 1 |
| V | 100.0 | — | — | — | — | — | — | 1 |
| V | — | 100.0 | — | — | — | — | — | 1 |
| V | 100.0 | — | — | — | — | — | — | 1 |
| V | — | 100.0 | — | — | — | — | — | 1 |
| V | — | — | 100.0 | — | — | — | — | 1 |
| V | 100.0 | — | — | — | — | — | — | 1 |
| V | — | — | — | — | — | 100.0 | — | 1 |
| V | 100.0 | — | — | — | — | — | — | 1 |
| V | 100.0 | — | — | — | — | — | — | 1 |
| V | — | — | 100.0 | — | — | — | — | 1 |
| Total (% by cohort total) | 1 288 740 (58.7%) | 358 322 (16.3%) | 164 740 (7.5%) | 101 512 (4.6%) | 7104 (0.3%) | 258 404 (11.8%) | 17 592 (0.8%) | 2 196 716 |
The deduced leader peptides from Table 1 are related to the corresponding donor’s broad race. The first percentages are based on the leader peptide’s total (horizontal). The top 5 peptides have percentages based on the broad race’s total (vertical) listed second. Deduced leader peptides with multiple unique nucleotide sequences are annotated with footnotes. The cohort total is 2 196 716.
These deduced leader peptides have four unique exon 1 sequences each in cohort 1.
Figure 1. BLEAT leader assignment. Two scenarios are depicted using the 2 HLA-B alleles from 1 subject to illustrate a simple use case (HLA-B*07:06:01) and a more complex use case (HLA-B*56:01:01G). For clarity, only the last 5 alleles (by number) are displayed for HLA-B*56:01:01G.[36] (A) The leader peptide is deduced from the exon 1 sequence. (B) The deduced leader peptide is available from IMGT in the Anthony Nolan HLA Informatics Group GitHub repository.[35] (C) This information is processed and reflected on BLEAT’s user interface to display P2 information. The user may hover over the P2 “Leader” icon to reveal a tooltip (D) that organizes all potential HLA-B alleles from the provided allele into known and unknown major (and any minor, if applicable) leader alleles. Rare minor leader alleles are highlighted in red on the user interface.
Figure 2. BLEAT provides a rank order for candidate stem cell sources (“donors”) to optimize therapy selection and patient outcomes. The BLEAT user interface (left) has an infographic (right) that is toggled via the Help button (A). On the interface, users enter HLA-B typing information for each subject. (A) The patient’s leader genotype (MM, MT, or TT) is automatically classified upon entering the patient’s HLA-B typing. Ambiguous alleles can sometimes contain rare leader variants; the tool highlights this possibility in red. (B) HLA-B allele typing for donors can be manually entered or automatically imported. Before or during sorting, (C) the leader match status for each donor (matched with the patient) is calculated. Once calculated, a donor selection guide (D) highlights each selected donor’s relative level of risk for acute GVHD based on the patient’s leader genotype and published outcomes.[13] The match status for each donor (E) is displayed through HapLogic match grades (A, allele match; P, potential match; L, allele mismatch; M, antigen mismatch)[37] and leader match status (eg, MTT, a 3-letter nomenclature used to designate the leader of the patient’s mismatched HLA-B [first letter], the leader of the donor’s mismatched HLA-B [second letter], and the leader of the shared/matched HLA-B [third letter]).[13]
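The 3-letter leader match status described in the caption can be sketched in Python for the single-mismatch case. This is a minimal illustration under stated assumptions, not BLEAT's implementation: each subject is given as two (allele, leader) pairs, the leader labels are supplied by the caller, alleles are compared by exact name, and exactly one HLA-B allele is shared. The function name and the example typings are hypothetical.

```python
def leader_match_status(patient, donor):
    """Return the 3-letter status: patient mismatched leader, donor
    mismatched leader, then the leader of the shared HLA-B allele.

    patient, donor: lists of two (allele_name, leader) tuples.
    """
    patient_left = list(patient)
    donor_left = list(donor)
    shared = []
    for entry in patient:
        if entry in donor_left:
            # Pair off one matching allele on each side.
            donor_left.remove(entry)
            patient_left.remove(entry)
            shared.append(entry)
    if len(shared) != 1:
        raise ValueError("sketch handles exactly one shared and one mismatched allele")
    return patient_left[0][1] + donor_left[0][1] + shared[0][1]


# Illustrative inputs; the leader labels here are supplied by hand.
patient = [("B*08:01", "M"), ("B*35:01", "T")]
donor = [("B*44:02", "T"), ("B*35:01", "T")]
print(leader_match_status(patient, donor))  # MTT
```

Fully matched or doubly mismatched pairs raise an error here because the 3-letter code, as described in the caption, assumes one mismatched and one shared HLA-B allele.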
Figure 3. Utility of BLEAT for diverse applications. The assignment of the leader genotype to the HLA-B allele may be useful for downstream applications by HLA laboratories, clinical researchers, donor registries, cell source banks, and health care workers. For HLA laboratories involved in basic research, these include exploration of allele diversity and haplotype linkage involving HLA-B to examine HLA structure and function. Laboratories may also supply this information for clinical research pursuits in understanding historical patient risk or modeling future risk to improve outcomes. Health care workers can apply BLEAT toward the selection of optimal donors based on transplant risk that can also be facilitated by donor registries or cell source banks. The assessment of HLA-B allele diversity can also aid in ensuring that therapies are available for diverse populations and enable individualized treatment.