Maria Luisa Matey-Hernandez, Søren Brunak, Jose M G Izarzugaza.
Abstract
BACKGROUND: The adaptive immune response intrinsically depends on hypervariable human leukocyte antigen (HLA) genes. Concomitantly, correct HLA phenotyping is crucial for successful donor-patient matching in organ transplantation. The cost and technical limitations of current laboratory techniques, together with advances in next-generation sequencing (NGS) methodologies, have increased the need for precise computational typing methods.
Keywords: Clinical genomics; HLA genotyping; NGS; Population genetics; Prediction
Year: 2018 PMID: 29940840 PMCID: PMC6019707 DOI: 10.1186/s12859-018-2239-6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Descent Accuracy (DA) for the two typers considered
| Typer | Overall | HLA-A | HLA-B | HLA-C |
|---|---|---|---|---|
| Polysolver (4-digit) | 0.95 | 0.95 | 0.95 | 0.96 |
| Optitype (4-digit) | 0.88 | 0.95 | 0.82 | 0.87 |
| Polysolver (8-digit) | 0.64 | 0.47 | 0.68 | 0.77 |
Optitype at 4-digit resolution performed better than Polysolver at 8-digit resolution. However, when allele reduction is applied (Polysolver calls reduced to 4-digit resolution), Polysolver surpasses Optitype on the Genome Denmark cohort.
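The allele reduction mentioned above can be illustrated with a minimal sketch. It assumes allele names in standard colon-separated HLA nomenclature (for example `A*02:01:01:01`); the function name `reduce_to_four_digit` is hypothetical, and the actual output format of either typer may differ and need its own parsing.

```python
def reduce_to_four_digit(allele: str) -> str:
    """Truncate an HLA allele name to its first two fields (4-digit resolution).

    Assumes names like 'A*02:01:01:01'. Expression suffixes (e.g. a trailing
    'N') on longer names are dropped together with the truncated fields.
    """
    gene, sep, fields = allele.partition("*")
    if not sep or not fields:
        raise ValueError(f"not an HLA allele name: {allele!r}")
    # Keep only the allele group and the specific protein fields.
    first_two = fields.split(":")[:2]
    if len(first_two) < 2:
        raise ValueError(f"fewer than two fields in: {allele!r}")
    return f"{gene}*{':'.join(first_two)}"


print(reduce_to_four_digit("A*02:01:01:01"))
print(reduce_to_four_digit("B*07:02"))
```

Two alleles that differ only in their third or fourth field collapse to the same 4-digit name after this step, which is why agreement-based metrics can rise when resolution is reduced.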
Method agreement (MA) across the different loci and overall
| | Overall | HLA-A | HLA-B | HLA-C |
|---|---|---|---|---|
| MA_TOTAL | 0.63 | 0.65 | 0.60 | 0.62 |
| MA_T | 0.62 | 0.63 | 0.60 | 0.64 |
| MA_NT | 0.63 | 0.67 | 0.60 | 0.60 |
MA represents the fraction of alleles typed identically by Optitype and Polysolver at 4-digit resolution. MA_TOTAL refers to the complete set of alleles, MA_T to the portion of alleles that are inherited from parent to child, and MA_NT to those that are not inherited and are therefore not part of the DA calculation.
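A minimal sketch of how such an agreement fraction could be computed is given here. It assumes each typer's calls are stored as a dictionary mapping a hypothetical `(individual, locus)` key to a pair of 4-digit alleles, and it counts agreement as the multiset overlap of the two pairs. The exact counting used in the study may differ.

```python
from collections import Counter


def shared_alleles(calls_a, calls_b):
    """Number of alleles two typers agree on at one locus (order-independent)."""
    return sum((Counter(calls_a) & Counter(calls_b)).values())


def method_agreement(typing_a, typing_b):
    """Fraction of alleles typed identically by two methods.

    typing_a, typing_b: dict mapping (individual, locus) -> tuple of alleles.
    Only keys present in both dictionaries are compared.
    """
    matched = total = 0
    for key in typing_a.keys() & typing_b.keys():
        matched += shared_alleles(typing_a[key], typing_b[key])
        total += len(typing_a[key])
    if total == 0:
        raise ValueError("no overlapping calls to compare")
    return matched / total


# Toy data, invented for illustration only.
optitype = {("s1", "A"): ("A*02:01", "A*01:01"),
            ("s1", "B"): ("B*07:02", "B*08:01")}
polysolver = {("s1", "A"): ("A*03:01", "A*02:01"),
              ("s1", "B"): ("B*07:02", "B*08:01")}
print(method_agreement(optitype, polysolver))  # 3 of 4 alleles agree
```

Restricting the keys to one locus, or to inherited versus non-inherited alleles, would give the per-locus and per-subset variants of the same fraction.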
Fig. 1 Concordance between the two methods at the level of individuals. The y-axis indicates the number of individuals, while the x-axis shows the number of alleles per individual identically typed by Optitype and Polysolver (4-digit)
Fig. 2 Blastn results of Optitype sequences against the Optitype database (left) and Polysolver sequences against the Polysolver database (right). The plots show that sequence identity within the Optitype database is higher than within the Polysolver database. This stems from the nature of each database. Optitype relies on a database with exons 2 and 3 and reconstructed introns, which produces sequences with scarce variation. As expected, Polysolver, which includes genomic sequences, shows more variance in identity between sequences. Self-hits (i.e. sequence A against itself) were removed from the analysis
Homozygosity Rates between methods
| Typer | Overall | HLA-A | HLA-B | HLA-C |
|---|---|---|---|---|
| Optitype (4-digit) | 0.09 | 0.08 | 0.08 | 0.11 |
| Polysolver (8-digit) | 0.08 | 0.08 | 0.02 | 0.14 |
| Polysolver (4-digit) | 0.12 | 0.12 | 0.09 | 0.15 |
Homozygosity rates between methods, based on the number of identical alleles at each locus (HLA-A, HLA-B or HLA-C) or across all three loci (Overall)
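As a rough illustration, a homozygosity rate of this kind can be sketched as the fraction of typed loci at which both called alleles are identical. The data layout (a dictionary keyed by a hypothetical `(individual, locus)` tuple) and the pooling of all loci for the overall value are assumptions, not details taken from the study.

```python
def homozygosity_rate(typing, locus=None):
    """Fraction of (individual, locus) calls whose two alleles are identical.

    typing: dict mapping (individual, locus) -> (allele_1, allele_2).
    locus:  restrict to one locus (e.g. "A"); None pools all loci.
    """
    pairs = [alleles for (_, loc), alleles in typing.items()
             if locus is None or loc == locus]
    if not pairs:
        raise ValueError("no calls for the requested locus")
    homozygous = sum(1 for first, second in pairs if first == second)
    return homozygous / len(pairs)


# Toy data, invented for illustration only.
calls = {
    ("s1", "A"): ("A*02:01", "A*02:01"),
    ("s1", "B"): ("B*07:02", "B*08:01"),
    ("s2", "A"): ("A*01:01", "A*03:01"),
    ("s2", "B"): ("B*07:02", "B*15:01"),
}
print(homozygosity_rate(calls))        # pooled over loci
print(homozygosity_rate(calls, "A"))   # HLA-A only
```

Because alleles that differ only beyond the second field become identical at 4-digit resolution, the same individuals can only look equally or more homozygous after reduction, which is consistent with the higher Polysolver rates at 4-digit than at 8-digit resolution in the table.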
Comparison of allele frequencies between different populations
| Allele | Northern Ireland | Sweden (South) | Sweden (North) | Germany | England (North) | Basque Country | Scotland Orkney | Genome Denmark (Polysolver) | Genome Denmark (Optitype) |
|---|---|---|---|---|---|---|---|---|---|
| A*02:01 | 0.27 | 0.26 | 0.24 | 0.28 | 0.29 | 0.27 | NA | [#1] 0.26 | [#1] 0.24 |
| A*01:01 | 0.20 | 0.09 | 0.08 | 0.15 | 0.21 | NA | NA | [#2] 0.21 | [#2] 0.22 |
| A*03:01 | 0.14 | 0.25 | 0.31 | 0.15 | 0.14 | NA | NA | [#3] 0.17 | [#3] 0.17 |
| A*24:02 | NA | 0.13 | 0.21 | 0.09 | 0.07 | NA | NA | [#4] 0.09 | [#5] 0.04 |
| A*11:01 | 0.08 | 0.06 | 0.01 | 0.06 | 0.07 | NA | NA | [#5] 0.04 | < 0.04 |
| A*23:01 | 0.01 | NA | NA | 0.023 | 0.02 | 0.02 | NA | < 0.045 | [#4] 0.05 |
| B*07:02 | 0.17 | 0.19 | 0.19 | 0.12 | 0.15 | NA | NA | [#1] 0.18 | [#1] 0.18 |
| B*07:05 | 10⁻³ | NA | NA | 4 × 10⁻³ | 3 × 10⁻³ | NA | NA | [#2] 0.15 | < 0.06 |
| B*15:01 | 0.04 | 0.14 | 0.15 | 0.06 | 0.06 | NA | NA | [#3] 0.07 | [#4] 0.09 |
| B*44:02 | 0.13 | 0.10 | 0.03 | 0.07 | 0.10 | NA | 0.26 | [#4] 0.07 | [#3] 0.09 |
| B*40:01 | 0.05 | 0.10 | 0.14 | 0.05 | 0.06 | NA | 0.06 | [#5] 0.06 | [#5] 0.06 |
| B*08:01 | 0.16 | 0.07 | 0.05 | 0.09 | 0.15 | NA | 0.17 | < 0.06 | [#2] 0.13 |
| C*07:01 | NA | NS | NS | 0.15 | 0.19 | NA | NA | [#1] 0.19 | [#1] 0.2 |
| C*07:02 | 0.19 | NS | NS | 0.13 | 0.16 | NA | NA | [#2] 0.18 | [#2] 0.17 |
| C*06:02 | 0.09 | NS | NS | 0.1 | 0.09 | 0.03 | 0.07 | [#3] 0.11 | [#4] 0.1 |
| C*03:03 | 0.05 | NS | NS | 0.05 | 0.06 | 0.07 | 0.09 | < 0.08 | [#3] 0.11 |
| C*03:04 | 0.06 | NS | NS | 0.07 | 0.08 | 0.05 | 0.05 | [#4] 0.09 | [#5] < 0.09 |
| C*05:01 | 0.13 | NS | NS | 0.06 | 0.10 | 0.14 | 5 × 10⁻³ | [#5] 0.08 | 0.09 |
Comparison of allele frequencies between populations that are historically related through settlements (Northern Ireland, England, Scotland Orkney), geographically close (Sweden, Germany) or unrelated (Basque Country). "NA" means that the particular allele is either not present in the population or not significant. "NS" means there is no data for this allele in the corresponding study. The top five most frequent alleles for each locus per method are included for the Genome Denmark cohort. The "#" marks indicate the rank of the allele among the most common alleles for that locus and method
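The frequency and rank columns for the cohort can be reproduced in principle with a short sketch. It assumes a flat list of 4-digit allele calls for one locus (two per individual) and defines frequency as the allele count divided by the total number of called alleles. The function name `top_alleles` and the toy data are illustrative only.

```python
from collections import Counter


def top_alleles(allele_calls, n=5):
    """Return the n most frequent alleles with their ranks and frequencies.

    allele_calls: iterable of allele names for a single locus.
    Result: list of (rank, allele, frequency), rank 1 being the most common.
    Ties are ordered by first appearance in the input.
    """
    counts = Counter(allele_calls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no allele calls given")
    return [(rank, allele, count / total)
            for rank, (allele, count) in enumerate(counts.most_common(n), start=1)]


# Toy data, invented for illustration only.
hla_a_calls = ["A*02:01"] * 5 + ["A*01:01"] * 3 + ["A*03:01"] * 2
for rank, allele, freq in top_alleles(hla_a_calls, n=2):
    print(f"[#{rank}] {allele} {freq:.2f}")
```

Running the same function on the calls of each typer separately would give the two Genome Denmark columns, and differences in rank between them point to alleles on which the methods disagree.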
Fig. 3 Distribution of alleles according to CWD for Polysolver (a) and Optitype (b). These results highlight that HLA-B harbours the rarest alleles