Minzhu Xie, Jing Li, Tao Jiang.
Abstract
BACKGROUND: The human leukocyte antigen system (HLA) contains many highly variable genes. HLA genes play an important role in the human immune system, and HLA gene matching is crucial for the success of human organ transplantations. Numerous studies have demonstrated that variation in HLA genes is associated with many autoimmune, inflammatory and infectious diseases. However, typing HLA genes by serology or PCR is time-consuming and expensive, which limits large-scale studies involving HLA genes. Since it is much easier and cheaper to obtain single nucleotide polymorphism (SNP) genotype data, accurate computational algorithms to infer HLA gene types from SNP genotype data are needed. To infer HLA types from SNP genotypes, the first step is to infer SNP haplotypes from genotypes. However, for the same SNP genotype data set, the haplotype configurations inferred by different methods are usually inconsistent, and it is often difficult to decide which one is true.
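The ambiguity mentioned at the end of the abstract can be illustrated with a minimal sketch. It assumes a hypothetical 0/1/2 genotype encoding (number of copies of one allele at each SNP), which is not taken from the paper, and simply enumerates every unordered haplotype pair compatible with one unphased genotype. A genotype with k heterozygous sites admits 2^(k-1) such pairs, which is why different phasing methods can return different answers.

```python
from itertools import product

def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: sequence of 0, 1 or 2 (assumed encoding: copies of allele 1 per SNP).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed (0 -> allele 0, 2 -> allele 1);
    # heterozygous sites get a placeholder 0 that is overwritten below.
    base = [g // 2 for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, b in zip(het_sites, bits):
            h1[site], h2[site] = b, 1 - b
        # Sort the two haplotypes so (h1, h2) and (h2, h1) count once.
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)

if __name__ == "__main__":
    for h1, h2 in compatible_haplotype_pairs([1, 0, 1, 2, 1]):
        print(h1, h2)
```

With three heterozygous sites the example genotype has four compatible pairs, and genotype data alone cannot tell them apart.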
Year: 2010 PMID: 21172045 PMCID: PMC3024871 DOI: 10.1186/1471-2105-11-S11-S10
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. HLA region of chromosome 6
Figure 2. HLA nomenclature
Figure 3. A sketch of WSG-HI
Figure 4. Comparison with IBD-HI. Comparison of the algorithm of Setty et al. (labeled as IBD-HI) and our algorithm (labeled as WSG-HI) at both 4-digit (a) and 2-digit (b) resolution levels. The accuracy of IBD-HI is obtained directly from [2], and all results are based on the leave-one-out test using genotype data from the 200kb regions centered around each HLA gene.
Figure 5. The accuracy of WSG-HI. When the size of the genotype region used by the algorithm changes, the accuracy of WSG-HI varies only slightly, with the exception of HLA-DRB1.
Experimental results of the leave-one-pedigree-out test using genotype data from the 200kb region centered around each HLA gene.
| Gene | 4-digit | | 2-digit | |
|---|---|---|---|---|
| HLA-A | 100 | 95.6 | 100 | 95.3 |
| HLA-B | 100 | 93.2 | 100 | 93.0 |
| HLA-C | 100 | 95.6 | 100 | 95.3 |
| HLA-DRB1 | 100 | 80.8 | 100 | 92.4 |
| HLA-DQA1 | 100 | 93.5 | 100 | 94.9 |
| HLA-DQB1 | 100 | 94.0 | 100 | 94.2 |
Figure 6. An example of the HLA gene type inference problem
Experimental results when T varies
| Gene | 1 | 2 | 3 |
|---|---|---|---|
| HLA-A | 95.89 | 96.50 | 96.50 |
| HLA-B | 93.57 | 94.31 | 94.26 |
| HLA-C | 94.82 | 96.65 | 96.34 |
| HLA-DRB1 | 84.19 | 83.87 | 84.52 |
| HLA-DQA1 | 98.00 | 98.29 | 98.29 |
| HLA-DQB1 | 97.14 | 97.43 | 97.71 |
Accuracy of WSG-HI in the leave-one-out test using genotype data from the 200kb region centered around each HLA gene when the threshold T varies from 1 to 3 and T = 0.65.
Figure 7. An example of a weighted similarity graph and one of its feasible forms. (a) An example of a weighted similarity graph. A solid line denotes a similarity edge and a dashed arc denotes a constraint edge. (b) A feasible form of (a).
Experimental results when T varies
| Gene | 0.55 | 0.60 | 0.65 | 0.70 | 0.75 | 0.80 | 0.85 | 0.90 |
|---|---|---|---|---|---|---|---|---|
| HLA-A | 95.89 | 95.89 | 96.50 | 95.89 | 95.89 | 95.89 | 95.89 | 95.89 |
| HLA-B | 95.00 | 94.64 | 94.31 | 94.64 | 94.64 | 94.64 | 94.64 | 94.64 |
| HLA-C | 96.34 | 96.34 | 96.65 | 95.73 | 96.04 | 95.43 | 95.12 | 93.29 |
| HLA-DRB1 | 83.87 | 83.87 | 83.87 | 83.87 | 83.87 | 83.87 | 83.87 | 83.87 |
| HLA-DQA1 | 98.29 | 98.29 | 98.29 | 97.71 | 98.00 | 97.71 | 97.71 | 96.57 |
| HLA-DQB1 | 97.14 | 97.43 | 97.43 | 97.14 | 97.14 | 96.86 | 97.14 | 96.86 |
Accuracy of WSG-HI in the leave-one-out test using genotype data from the 200kb region centered at each HLA gene when the threshold T varies from 0.55 to 0.90 and T = 2.
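The leave-one-out accuracies reported in the tables can be illustrated with a minimal sketch of the evaluation loop. Everything here is an assumption for illustration: the toy haplotypes and allele labels are made up, the nearest-neighbour predictor is a stand-in and not WSG-HI, and accuracy is counted per sample, which may differ from how the paper counts it. The helper `to_two_digit` truncates a 4-digit allele name such as `A*0201` to its 2-digit form `A*02`.

```python
def to_two_digit(allele):
    """Truncate a 4-digit allele name such as 'A*0201' to 2-digit 'A*02'."""
    gene, digits = allele.split("*")
    return f"{gene}*{digits[:2]}"

def hamming(a, b):
    """Number of positions at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbor_predict(train, query):
    """Stand-in predictor (not WSG-HI): label of the closest training haplotype."""
    return min(train, key=lambda item: hamming(item[0], query))[1]

def leave_one_out_accuracy(samples, predict, resolution=4):
    """Hold out each sample in turn, predict it from the rest, return % correct."""
    correct = 0
    for i, (snps, truth) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        guess = predict(train, snps)
        if resolution == 2:
            guess, truth = to_two_digit(guess), to_two_digit(truth)
        correct += guess == truth
    return 100.0 * correct / len(samples)

# Hypothetical toy data: (SNP haplotype, HLA allele) pairs.
samples = [
    ((0, 0, 1, 1), "A*0201"),
    ((0, 0, 1, 0), "A*0201"),
    ((0, 1, 1, 1), "A*0205"),
    ((1, 1, 0, 0), "A*0101"),
    ((1, 1, 0, 1), "A*0101"),
]

if __name__ == "__main__":
    for res in (4, 2):
        acc = leave_one_out_accuracy(samples, nearest_neighbor_predict, res)
        print(f"{res}-digit accuracy: {acc:.1f}%")
```

On this toy data the 2-digit score is higher than the 4-digit score, because predicting `A*0201` for a true `A*0205` is wrong at 4-digit resolution but correct once both names are truncated to `A*02`.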