Yen-Yen Wang, Takahiro Mimori, Seik-Soon Khor, Olivier Gervais, Yosuke Kawai, Yuki Hitomi, Katsushi Tokunaga, Masao Nagasaki.
Abstract
HLA-VBSeq is an HLA calling tool developed to infer the most likely HLA types from high-throughput sequencing data. However, because HLA alleles are highly diverse across human populations, there is still room for improvement in specific genetic groups. Here, we present HLA-VBSeq v2, a software application that uses a new Japanese HLA reference panel to improve calling accuracy for Japanese HLA class I genes. Our analysis showed significant improvements in calling accuracy in all HLA regions, with prediction accuracies exceeding 99.0, 97.8, and 99.8% in HLA-A, -B, and -C, respectively.
Keywords: Data publication and archiving; Next-generation sequencing
Year: 2019 PMID: 31240105 PMCID: PMC6584547 DOI: 10.1038/s41439-019-0061-y
Source DB: PubMed Journal: Hum Genome Var ISSN: 2054-345X
Comparison of the prediction accuracy between the software programs studied for each dataset and HLA gene region
| Dataset | Gene | n | HLA-VBSeq, % | HLA-VBSeq v2, % | HLA*PRG:LA, % |
|---|---|---|---|---|---|
| SJS | HLA-A | 228 | 98.2 | 98.2 | |
| SJS | HLA-B | 228 | 92.5 | 97.8 | |
| SJS | HLA-C | 228 | 90.4 | 98.7 | |
| THC | HLA-A | 836 | 98.4 | 98.1 | |
| THC | HLA-B | 836 | 91.3 | 98.3 | |
| THC | HLA-C | 818 | 92.1 | 98.4 | |
n corresponds to the number of alleles in each dataset.
THC: Tokyo Healthy Controls dataset; SJS: Stevens–Johnson syndrome dataset.
Bold font corresponds to the highest prediction accuracy for each HLA gene
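
As a rough illustration of how percentages like those above relate to allele counts, the sketch below computes prediction accuracy as the share of alleles whose call agrees with the reference type. The function name `prediction_accuracy` and the example count of 4 discordant calls are hypothetical, chosen only for illustration; the exact scoring rule used in the study is an assumption here, not something confirmed by the text.

```python
def prediction_accuracy(n_alleles: int, n_discordant: int) -> float:
    """Return the percentage of alleles whose call matches the reference type.

    Assumption: accuracy = (n - discordant) / n * 100, where n is the
    number of alleles in the dataset for one HLA gene.
    """
    if n_alleles <= 0:
        raise ValueError("n_alleles must be positive")
    if not 0 <= n_discordant <= n_alleles:
        raise ValueError("n_discordant must lie between 0 and n_alleles")
    return 100.0 * (n_alleles - n_discordant) / n_alleles


# Hypothetical example: 4 discordant calls among 228 alleles.
print(f"{prediction_accuracy(228, 4):.1f}")  # prints 98.2
```

Under this assumption, each additional discordant call in a 228-allele dataset lowers the accuracy by roughly 0.44 percentage points.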
Details of the calling results inconsistent with the “true” HLA types. “fail” indicates that the program failed to reach an appropriate calling result
| Dataset | Gene | True type | n | HLA-VBSeq discordant call | n | HLA-VBSeq v2 discordant call | n |
|---|---|---|---|---|---|---|---|
| SJS | A | A*02:01 | 33 | A*02:06 | 2 | A*02:06 | 1 |
| A*24:02 | 55 | fail | 1 | ||||
| A*26:02 | 3 | A*26:01 | 1 | ||||
| A*31:01 | 18 | A*33:03 | 1 | ||||
| B | B*15:01 | 8 | B*46:01 | 1 | |||
| B*15:27 | 1 | B*15:01 | 1 | ||||
| B*40:06 | 8 | B*40:02 | 5 | ||||
| B*51:01 | 21 | B*51:02 | 4 | B*51:02 | 3 | ||
| B*54:01 | 8 | B*55:01 | 1 | ||||
| B*55:02 | 4 | B*55:01 | 2 | ||||
| B*56:01 | 3 | B*55:01 | 3 | ||||
| B*59:04 | 1 | B*59:01 | 1 | B*59:01 | 1 | ||
| C | C*01:02 | 40 | C*14:02 | 1 | |||
| C*04:82 | 1 | C*04:01 | 1 | ||||
| 28 | 20 | ||||||
| THC | A | A*02:15N | 1 | A*02:07 | 1 | ||
| A*03:02 | 1 | A*03:01 | 1 | ||||
| A*11:02 | A*11:01 | 1 | |||||
| A*24:02 | 314 | A*26:01 | 1 | ||||
| A*24:08 | 1 | A*24:02 | 1 | ||||
| A*24:20 | 10 | A*24:02 | 1 | ||||
| A*26:01 | 67 | A*02:06 | 1 | A*02:06 | 1 | ||
| A*25:01 | 1 | ||||||
| A*33:03 | 4 | ||||||
| A*26:02 | 12 | A*24:02 | 2 | A*24:02 | 1 | ||
| A*26:03 | 22 | A*24:02 | 1 | ||||
| A*26:01 | 1 | ||||||
| A*66:01 | 1 | ||||||
| A*26:05 | 1 | A*26:01 | 1 | A*26:01 | 1 | ||
| B | B*07:169 | 1 | B*07:02 | 1 | B*07:02 | 1 | |
| B*15:01 | 71 | B*46:01 | 1 | ||||
| B*15:07 | 5 | B*15:01 | 1 | ||||
| B*15:11 | 5 | B*07:02 | 1 | ||||
| B*15:01 | 1 | ||||||
| B*40:02 | 1 | ||||||
| B*46:01 | 1 | ||||||
| B*15:27 | 1 | B*15:01 | 1 | ||||
| B*46:01 | 1 | ||||||
| B*15:28 | 1 | B*15:428 | 1 | B*15:428 | 1 | ||
| B*35:01 | 67 | B*39:01 | 1 | B*39:01 | 1 | ||
| B*40:02 | 57 | B*40:356 | 1 | ||||
| B*40:06 | 34 | B*40:01 | 1 | ||||
| B*40:02 | 20 | ||||||
| B*40:04 | 2 | ||||||
| B*44:03 | 1 | ||||||
| B*46:01 | 1 | ||||||
| B*40:52 | 1 | B*40:01 | 1 | ||||
| B*40:02 | 1 | ||||||
| B*51:01 | 70 | B*15:01 | 1 | B*15:01 | 1 | ||
| B*40:01 | 1 | ||||||
| B*51:02 | 1 | B*51:02 | 4 | ||||
| B*54:01 | 1 | ||||||
| B*52:01 | 80 | B*07:02 | 1 | ||||
| B*51:01 | 1 | ||||||
| B*51:02 | 1 | ||||||
| B*52:54 | 1 | ||||||
| B*54:01 | 64 | B*15:18 | 1 | ||||
| B*35:01 | 2 | ||||||
| B*40:02 | 2 | ||||||
| B*44:02 | 1 | ||||||
| B*44:03 | 3 | ||||||
| B*46:01 | 1 | ||||||
| B*55:01 | 7 | ||||||
| B*55:02 | 20 | B*35:01 | 1 | ||||
| B*40:01 | 1 | ||||||
| B*51:01 | 2 | ||||||
| B*54:01 | 1 | ||||||
| B*55:01 | 3 | ||||||
| B*67:01 | 1 | ||||||
| B*56:01 | 5 | B*55:01 | 4 | ||||
| B*67:01 | 11 | B*39:01 | 2 | ||||
| C | C*01:02 | 135 | C*08:01 | 1 | |||
| C*03:02 | 2 | C*03:04 | 2 | ||||
| C*03:03 | 111 | C*03:04 | 1 | C*03:04 | 1 | ||
| C*07:02 | 1 | C*07:02 | 1 | ||||
| C*04:82 | 10 | C*04:01 | 8 | ||||
| C*08:22 | 2 | C*08:01 | 1 | ||||
| 67 | C*07:02 | 1 | |||||
| C*12:02 | 1 | ||||||
| 49 | |||||||
Bold font highlights an example (mentioned in the text) for which including the ToMMo HLA panel allowed identification of the correct HLA type
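
The comparison behind a discordance table of this kind can be sketched as follows. This is a minimal illustration, assuming that calls are compared with the "true" types at two-field resolution (the resolution at which alleles are listed above) and that a failed call, represented here as `None`, counts as discordant. The helper names `two_field` and `is_concordant` are hypothetical and are not part of HLA-VBSeq.

```python
from typing import Optional


def two_field(allele: str) -> str:
    """Trim an HLA allele name such as 'A*24:02:01:01' to 'A*24:02'."""
    gene, sep, fields = allele.strip().partition("*")
    if not sep:
        raise ValueError(f"not an HLA allele name: {allele!r}")
    return gene + "*" + ":".join(fields.split(":")[:2])


def is_concordant(called: Optional[str], true_type: str) -> bool:
    """Compare a call with the reference type at two-field resolution.

    A missing call (None) stands for a program failure and is treated
    as discordant; this convention is an assumption of the sketch.
    """
    if called is None:
        return False
    return two_field(called) == two_field(true_type)


print(is_concordant("A*24:02:01:01", "A*24:02"))  # True
print(is_concordant("C*04:01", "C*04:82"))        # False
print(is_concordant(None, "A*24:02"))             # False
```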
Comparison of typing results between the Luminex method and NGS-based HLA typing for all samples with inconsistent results. Bold font corresponds to the typing results from NGS-based HLA typing
| Dataset | Sample | HLA typing: NGS-based (a) | HLA typing: Luminex | HLA calling: HLA-VBSeq | HLA calling: HLA-VBSeq v2 | HLA calling: HLA*PRG:LA |
|---|---|---|---|---|---|---|
| SJS | SJS01 | B*59:01 | B*59:01 | B*59:01 | | |
| SJS | SJS02 | C*07:01 | C*07:01 | | | |
| SJS | SJS03 | C*08:01 | C*08:01 | | | |
| SJS | SJS04 | C*04:01 | C*04:01 | C*04:01 | | |
| THC | THC01 | A*02:07 | A*02:07 | A*02:07 | | |
| THC | THC02 | A*11:01 | A*11:01 | | | |
| THC | THC03 | B*07:02 | B*07:02 | B*07:02 | B*07:02 | |
| THC | THC04 | C*08:01 | C*08:01 | | | |
| THC | THC05 | C*08:01 | C*08:01 | C*08:01 | | |
| THC | THC06 | C*04:01 | C*04:01 | C*04:01 | | |
| THC | THC07 | C*04:01 | C*04:01 | | | |
| THC | THC08 | C*04:01 | C*04:01 | C*04:01 | | |
| THC | THC09 | C*04:01 | C*04:01 | C*04:01 | | |
| THC | THC10 | C*04:01 | C*04:01 | C*04:01 | | |
| THC | THC11 | C*04:01 | C*04:01 | | | |
| THC | THC12 | C*04:01 | C*04:01 | C*04:01 | | |
| THC | THC13 | C*04:01 | C*04:01 | C*04:01 | | |
| THC | THC14 | C*04:01 | C*04:01 | C*04:01 | | |
| THC | THC15 | C*04:01 | C*04:01 | C*04:01 | | |
(a) These typing results were used as the “true” types.