Antti Larjo1, Robert Eveleigh2, Elina Kilpeläinen1, Tony Kwan2, Tomi Pastinen2, Satu Koskela1, Jukka Partanen1.
Abstract
The human leukocyte antigen (HLA) genes code for proteins that play a central role in the function of the immune system by presenting peptide antigens to T cells. As HLA genes show extremely high genetic polymorphism, HLA typing at the allele level is demanding and is based on DNA sequencing. Determination of HLA alleles is warranted as HLA alleles are major genetic risk factors in autoimmune diseases and are matched in transplantation. Here, we compared the accuracy of several published HLA-typing algorithms that are based on next-generation sequencing (NGS) data. As genome sequencing is becoming increasingly common in research, we wanted to test how well HLA alleles can be deduced from genome data produced in studies with objectives other than HLA typing and on platforms not specifically designed for HLA typing. The accuracies were assessed using datasets consisting of NGS data produced using an in-house sequencing platform, including the full 4 Mbp HLA segment, from 94 stem cell transplantation patients and exome sequences from 63 samples of the 1000 Genomes collection. In the patient dataset, none of the programs gave perfect results for all the samples and genes when used with the default settings. However, we found that ensemble prediction of the results or modifications of the settings could be used to improve accuracy. For the exome-only data, most of the algorithms did not perform very well. The results indicate that the use of these algorithms for accurate HLA allele determination is not straightforward when based on NGS data not specifically targeted to HLA typing, and that their accurate use requires HLA expertise.
Keywords: genetic variation; genome sequence; histocompatibility; human leukocyte antigen alleles; transplantation
Year: 2017 PMID: 29326702 PMCID: PMC5733459 DOI: 10.3389/fimmu.2017.01815
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
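The abstract reports that ensemble prediction improved accuracy but does not describe how the ensemble was formed. The following is a minimal sketch only, assuming a simple majority vote over the per-program calls for one allele; the voting rule, the `min_votes` threshold, the function name, and the example calls are all illustrative assumptions, not the authors' method or data.

```python
from collections import Counter


def ensemble_call(calls, min_votes=2):
    """Majority vote over per-program allele calls (hypothetical rule).

    calls: dict mapping program name -> allele string, or None if the
    program produced no call. Returns the winning allele, or None when
    no allele reaches min_votes or the top two alleles are tied.
    """
    votes = Counter(c for c in calls.values() if c is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    top_allele, top_count = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_count
    if top_count < min_votes or tied:
        return None
    return top_allele


# Made-up calls for a single allele, for illustration only.
example = {
    "HLAssign": "02:01",
    "HLAreporter": "02:01",
    "ATHLATES": "02:06",
    "Optitype": None,
}
print(ensemble_call(example))
```

Returning `None` on ties or weak support leaves ambiguous cases open for expert review rather than forcing a call, which fits the abstract's point that accurate use requires HLA expertise.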
Figure 1. Accuracy of human leukocyte antigen (HLA) interpretation programs to determine HLA class I alleles in the Finnish Red Cross Blood Service dataset comprising 93 Finnish individuals. Concordance rates to standard clinical HLA typing, together with ensemble results, are shown for each program package.
Summary of accuracies of the HLA interpretation programs and the ensemble result based on Finnish Red Cross Blood Service dataset comprising 93 samples.
| Class | Program | Concordant result | Different | Accuracy (%) |
|---|---|---|---|---|
| Class I | HLAssign | 538 | 20 | 96.42 |
| | HLAreporter | 533 | 25 | 95.52 |
| | ATHLATES | 556 | 2 | 99.64 |
| | Optitype | 555 | 3 | 99.46 |
| | Omixon Target | 554 | 3 | 99.28 |
| | Ensemble | 558 | 0 | 100 |
| Class II | HLAssign | 703 | 45 | 93.98 |
| | HLAreporter | 836 | 15 | 98.24 |
| | ATHLATES | 703 | 147 | 82.24 |
| | Optitype | na | na | na |
| | Omixon Target | 822 | 28 | 96.71 |
| | Ensemble | 848 | 4 | 99.53 |
Concordance rates to standard clinical human leukocyte antigen (HLA) typing are shown.
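The accuracy column appears to be the concordant count divided by the sum of concordant and different counts. This is an inference from the numbers, not a formula stated in the source, and not every row reproduces exactly under it. The sketch below recomputes the value for a few rows of the table; the function name `accuracy_pct` is hypothetical.

```python
def accuracy_pct(concordant, different):
    """Concordance rate in percent, assuming
    accuracy = concordant / (concordant + different)."""
    total = concordant + different
    if total == 0:
        raise ValueError("no typed alleles to evaluate")
    return round(100.0 * concordant / total, 2)


# (class, program) -> (concordant, different), copied from the table above.
rows = {
    ("Class I", "HLAssign"): (538, 20),
    ("Class I", "HLAreporter"): (533, 25),
    ("Class II", "HLAreporter"): (836, 15),
    ("Class II", "Ensemble"): (848, 4),
}

for (hla_class, program), (concordant, different) in rows.items():
    print(hla_class, program, accuracy_pct(concordant, different))
```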
Figure 2. Accuracy of human leukocyte antigen (HLA) interpretation programs to type HLA class II alleles in the Finnish Red Cross Blood Service dataset comprising 93 Finnish individuals. Concordance rates to standard clinical HLA typing, together with ensemble results, are shown for each program package.
Sequencing depth and coverage in three samples with discrepant results in human leukocyte antigen (HLA) interpretation by the Omixon Explore software.
| | Exons covered | Depth range | Exon 1 | Exon 2 | Exon 3 | Exon 4 | Exon 5 | Exon 6 | Exon 7 | Omixon Target | Omixon Explore | Reference method |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 5 | 3–279 | 37–305 | 3–105 | 51–446 | 40–307 | 155–413 | na | na | | | |
| HLA-DQB1 allele 1 | 4 | Average 134 | 167 | 89 | 148 | 178 | – | na | na | 02:01 | 02:01 | 02:01 |
| HLA-DQB1 allele 2 | 4 | Average 138 | 167 | 89 | 148 | 178 | – | na | na | 02:06 | 02:02 | 02:02 |
| HLA-DQB1 allele 1 | 4 | Average 173 | 189 | 77 | 245 | 205 | – | na | na | 03:01 | 03:01 | 03:01 |
| HLA-DQB1 allele 2 | 2 | Average 118 | – | 77 | 158 | – | – | na | na | 03:22 | 03:22 | 03:01 |
| | 7 | 16–322 | 4–206 | 5–239 | 4–305 | 4–463 | 12–426 | 1–389 | 1–366 | | | |
| HLA-A allele 1 | 6 | Average 322 | 206 | 239 | 305 | 463 | 426 | – | 1 | 02:01 | 02:01 | 02:01 |
| HLA-A allele 2 | 6 | Average 322 | 206 | 239 | 305 | 463 | 426 | – | 1 | 02:197 | 02:01 | 02:01 |
Samples FRC13 and FRC36 had an HLA-DQB1 discrepancy and sample FRC37 had an HLA-A discrepancy.
Summary of accuracies of the human leukocyte antigen (HLA) interpretation programs for 1000 Genomes samples.
| Program | Number of alleles | Correct (all alleles) | Accuracy (all alleles) (%) | Missing alleles, n (%) | Correct (intersection) | Accuracy (intersection) (%) |
|---|---|---|---|---|---|---|
| HLAreporter | 630 | 202 | 32.06 | 190 (30.16) | 91 | 55.49 |
| ATHLATES | 630 | 468 | 74.29 | 78 (12.38) | 137 | 83.54 |
| Optitype | 378 | 372 | 98.41 | 0 (0.00) | 161 | 98.17 |
Missing allele refers to either a completely missing call or one for which no consensus could be formed based on a list of detected alleles. Intersection denotes the set of alleles for all sample–gene pairs that had a non-missing typing in all tested typing programs. No results could be achieved using HLAssign.
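The two accuracy figures in the table differ only in their denominator: all alleles versus alleles in sample–gene pairs that every program typed without a missing call. The sketch below illustrates that distinction on made-up data. The data layout, function names, and program labels (`prog_a`, `prog_b`) are hypothetical, and for simplicity it assumes every program is evaluated on the same set of genes, which is not the case in the table (Optitype has fewer alleles).

```python
from collections import Counter


def n_correct(called, expected):
    """Count called alleles matching the reference, ignoring allele order."""
    called_counts = Counter(a for a in called if a is not None)
    return sum((called_counts & Counter(expected)).values())


def summarize(truth, calls_by_program):
    """truth: (sample, gene) -> (allele1, allele2).
    calls_by_program: program -> {(sample, gene): (call1, call2)}, None = missing.
    """
    missing_pair = (None, None)
    # Pairs for which every program produced two non-missing calls.
    shared = [
        pair for pair in truth
        if all(None not in calls.get(pair, missing_pair)
               for calls in calls_by_program.values())
    ]
    n_alleles = 2 * len(truth)
    summary = {}
    for program, calls in calls_by_program.items():
        correct_all = sum(
            n_correct(calls.get(p, missing_pair), truth[p]) for p in truth)
        missing = sum(
            a is None for p in truth for a in calls.get(p, missing_pair))
        correct_shared = sum(n_correct(calls[p], truth[p]) for p in shared)
        summary[program] = {
            "accuracy_all": 100.0 * correct_all / n_alleles,
            "missing": missing,
            "accuracy_intersection": (
                100.0 * correct_shared / (2 * len(shared)) if shared else None),
        }
    return summary


# Made-up reference typings and calls, for illustration only.
truth = {("S1", "HLA-A"): ("02:01", "03:01"), ("S1", "HLA-B"): ("07:02", "08:01")}
calls_by_program = {
    "prog_a": {("S1", "HLA-A"): ("02:01", "03:01"), ("S1", "HLA-B"): ("08:01", None)},
    "prog_b": {("S1", "HLA-A"): ("03:01", "24:02"), ("S1", "HLA-B"): ("07:02", "08:01")},
}
print(summarize(truth, calls_by_program))
```

In this toy case both programs score the same over all alleles, but they diverge on the intersection because the missing call by `prog_a` removes the HLA-B pair from the shared set.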