Jason H Karnes, Christian M Shaffer, Lisa Bastarache, Silvana Gaudieri, Andrew M Glazer, Heidi E Steiner, Jonathan D Mosley, Simon Mallal, Joshua C Denny, Elizabeth J Phillips, Dan M Roden.
Abstract
Imputation of human leukocyte antigen (HLA) alleles from SNP-level data is attractive due to the importance of HLA alleles in human disease, the widespread availability of genome-wide association study (GWAS) data, and the expertise required for HLA sequencing. However, comprehensive evaluations of HLA imputation programs are limited. We compared HLA imputation results of HIBAG, SNP2HLA, and HLA*IMP:02 to sequenced HLA alleles in 3,265 samples from BioVU, a de-identified electronic health record database coupled to a DNA biorepository. We performed four-digit HLA sequencing for HLA-A, -B, -C, -DRB1, -DPB1, and -DQB1 using long-read 454 FLX sequencing. All samples were genotyped using both the Illumina HumanExome BeadChip platform and a GWAS platform. Call rates and concordance rates were compared by platform, frequency of allele, and race/ethnicity. Overall concordance rates were similar between programs in European Americans (EAs) (0.975 [SNP2HLA]; 0.939 [HLA*IMP:02]; 0.976 [HIBAG]). SNP2HLA provided a significant advantage in terms of call rate and the number of alleles imputed. Concordance rates were lower overall for African Americans (AAs). These observations were consistent when accuracy was compared across HLA loci. All imputation programs performed similarly for low-frequency HLA alleles. Higher concordance rates were observed when HLA alleles were imputed from GWAS platforms versus the HumanExome BeadChip, suggesting that high genomic coverage is preferred as input for HLA allelic imputation. These findings provide guidance on the best use of HLA imputation methods and elucidate their limitations.
Year: 2017 PMID: 28207879 PMCID: PMC5312875 DOI: 10.1371/journal.pone.0172444
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Principal components analysis of 1000 Genomes samples and study population.
Eigenvectors 1 and 2 are plotted to determine racial descent and admixture of European and African Americans in the BioVU population. EA indicates European American (BioVU); AA, African American (BioVU); CEPH, 1000 Genomes Utah Residents; ASW, Americans of African Ancestry in Southwestern USA; MKK, Maasai in Kinyawa, Kenya; CHB, Han Chinese in Beijing, China; JPT, Japanese in Tokyo, Japan; LWK, Luhya in Webuye, Kenya; YRI, Yoruba in Ibadan, Nigeria.
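The projection described in this caption can be illustrated with a small Python sketch. The record does not name the software used for the principal components analysis, so the code below is only a generic illustration: it assumes genotypes are coded as 0/1/2 allele counts, and the function name `top_two_pcs` is hypothetical.

```python
import numpy as np

def top_two_pcs(genotypes):
    """Return the first two principal-component scores per sample.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts
    (an assumed encoding; the study's input format is not given here).
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)                 # center each SNP
    sd = g.std(axis=0)
    keep = sd > 0                          # drop monomorphic SNPs
    g = g[:, keep] / sd[keep]              # scale each SNP to unit variance
    # SVD of the standardized matrix; columns of u are the sample eigenvectors
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :2] * s[:2]

# Toy data: two groups with very different allele frequencies
rng = np.random.default_rng(0)
group_a = rng.binomial(2, 0.1, size=(20, 200))
group_b = rng.binomial(2, 0.9, size=(20, 200))
pcs = top_two_pcs(np.vstack([group_a, group_b]))
```

Plotting the two columns of `pcs` against each other, with study samples overlaid on reference populations, gives the kind of scatter plot the caption describes.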
HLA imputation programs evaluation for all HLA alleles.
| Race/Ethnicity | Imputation Program | Concordance Rate | Call Rate | Predicted Alleles (n) |
|---|---|---|---|---|
| European Americans (n = 2,947) | SNP2HLA | 0.975 | 1.00 | 210 |
| European Americans (n = 2,947) | HLA*IMP:02 | 0.939 | 0.985 | 140 |
| European Americans (n = 2,947) | HIBAG | 0.976 | 0.978 | 175 |
| African Americans (n = 318) | SNP2HLA | 0.919 | 0.999 | 174 |
| African Americans (n = 318) | HLA*IMP:02 | 0.619 | 0.768 | 134 |
| African Americans (n = 318) | HIBAG | 0.929 | 0.584 | 131 |
Concordance and call rates were generated from imputed alleles with posterior probability > 0.50 versus sequenced alleles after combining data for the HumanOmni1-QUAD and HumanOmni5-QUAD platforms, by race/ethnicity.
1) Based on sequencing, 325 distinct four-digit alleles were present in the European American population.
2) Based on sequencing, 219 distinct four-digit alleles were present in the African American population.
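The table above reports concordance and call rates at a posterior probability threshold of 0.50. The Python sketch below shows one way such rates could be computed. It assumes that call rate is the fraction of sequenced alleles whose imputed genotype exceeds the threshold, and that concordance is the fraction of called alleles that match the sequenced alleles regardless of order. These definitions, the function name `call_and_concordance`, and the input layout are assumptions for illustration and are not the study's documented procedure.

```python
from collections import Counter

def call_and_concordance(imputed, sequenced, threshold=0.50):
    """Compute (call_rate, concordance) at one locus.

    imputed:   {sample_id: (allele1, allele2, posterior_probability)}
    sequenced: {sample_id: (allele1, allele2)}
    Both layouts are hypothetical.
    """
    total = called = matched = 0
    for sample, truth in sequenced.items():
        total += 2                          # two alleles per sample
        if sample not in imputed:
            continue
        a1, a2, posterior = imputed[sample]
        if posterior <= threshold:          # genotype not called
            continue
        called += 2
        # Unordered comparison: count alleles shared by both genotypes
        overlap = Counter([a1, a2]) & Counter(truth)
        matched += sum(overlap.values())
    call_rate = called / total if total else float("nan")
    concordance = matched / called if called else float("nan")
    return call_rate, concordance

# Toy example with made-up genotypes
imputed = {
    "s1": ("A*01:01", "A*02:01", 0.90),
    "s2": ("A*03:01", "A*03:01", 0.40),   # below threshold, not called
    "s3": ("A*24:02", "A*02:01", 0.80),
}
sequenced = {
    "s1": ("A*02:01", "A*01:01"),
    "s2": ("A*03:01", "A*11:01"),
    "s3": ("A*24:02", "A*68:01"),
}
call_rate, concordance = call_and_concordance(imputed, sequenced)
```

Under these definitions a program can show high concordance with a low call rate, the pattern seen for HIBAG in African Americans in the table (0.929 and 0.584).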
Concordance rate and call rate for each imputation program.
| Allele | Imputation Program | Concordance Rate (European Americans) | Call Rate (European Americans) | Concordance Rate (African Americans) | Call Rate (African Americans) |
|---|---|---|---|---|---|
| | SNP2HLA | 0.983 | 0.999 | 0.969 | 0.995 |
| | HLA*IMP:02 | 0.963 | 0.997 | 0.675 | 0.855 |
| | HIBAG | 0.986 | 0.996 | 0.960 | 0.796 |
| | SNP2HLA | 0.969 | 1.00 | 0.884 | 1.00 |
| | HLA*IMP:02 | 0.952 | 0.979 | 0.423 | 0.752 |
| | HIBAG | 0.978 | 0.967 | 0.953 | 0.403 |
| | SNP2HLA | 0.987 | 1.00 | 0.884 | 1.00 |
| | HLA*IMP:02 | 0.984 | 0.994 | 0.792 | 0.741 |
| | HIBAG | 0.987 | 0.992 | 0.957 | 0.619 |
| | SNP2HLA | 0.957 | 1.00 | 0.945 | 1.00 |
| | HLA*IMP:02 | 0.829 | 0.987 | 0.567 | 0.708 |
| | HIBAG | 0.957 | 0.975 | 0.834 | 0.475 |
| | SNP2HLA | 0.988 | 1.00 | 0.907 | 1.00 |
| | HLA*IMP:02 | 0.983 | 0.993 | 0.845 | 0.761 |
| | HIBAG | 0.988 | 0.990 | 0.904 | 0.654 |
| | SNP2HLA | 0.964 | 1.00 | 0.920 | 1.00 |
| | HLA*IMP:02 | 0.924 | 0.961 | 0.414 | 0.791 |
| | HIBAG | 0.959 | 0.946 | 0.946 | 0.557 |
Concordance and call rates were generated from imputed alleles with posterior probability > 0.50 versus sequenced alleles after combining data for the HumanOmni1-QUAD and HumanOmni5-QUAD platforms, by HLA locus and race/ethnicity.
Concordance rates and call rates for imputation programs for all HLA loci by platform and allele frequency in European Americans. Each cell gives concordance rate/call rate.
| | | SNP2HLA | HLA*IMP:02 | HIBAG |
|---|---|---|---|---|
| Platform | HumanExome BeadChip | .969/.999 | .892/.950 | .976/.973 |
| Platform | HumanOmni1-QUAD | .975/1.00 | .939/.985 | .976/.978 |
| Platform | HumanOmni5-QUAD | .975/1.00 | .938/.985 | .975/.977 |
| Platform | HumanOmni1-QUAD / HumanOmni5-QUAD | .976/1.00 | .942/.986 | .979/.979 |
| HLA allele frequency | Freq. < 0.05 | .981/- | .951/- | .975/- |
| HLA allele frequency | Freq. < 0.01 | .979/- | .945/- | .971/- |
Freq. indicates frequency cutoff; HLA, human leukocyte antigen.
1) Call rates were not estimated when frequency cutoffs were applied.
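The frequency-stratified rows above restrict concordance to HLA alleles below a frequency cutoff. A minimal Python sketch of one possible calculation follows. It assumes allele frequencies are estimated from the sequenced genotypes and that only sequenced alleles rarer than the cutoff are scored. The record does not describe the exact procedure, so both choices, along with the function names, are assumptions.

```python
from collections import Counter

def allele_frequencies(sequenced):
    """Estimate allele frequencies from sequenced genotypes.

    sequenced: {sample_id: (allele1, allele2)} (hypothetical layout)
    """
    counts = Counter(a for pair in sequenced.values() for a in pair)
    n = sum(counts.values())
    return {allele: c / n for allele, c in counts.items()}

def concordance_below_cutoff(imputed, sequenced, cutoff):
    """Concordance restricted to sequenced alleles with frequency < cutoff.

    imputed: {sample_id: (allele1, allele2)}, assumed already filtered
    by posterior probability.
    """
    freqs = allele_frequencies(sequenced)
    evaluated = matched = 0
    for sample, truth in sequenced.items():
        if sample not in imputed:
            continue
        remaining = Counter(imputed[sample])
        for allele in truth:
            hit = remaining[allele] > 0
            if hit:
                remaining[allele] -= 1   # each imputed allele matches once
            if freqs[allele] < cutoff:   # score only the rarer alleles
                evaluated += 1
                matched += hit
    return matched / evaluated if evaluated else float("nan")

# Toy example: allele X is common, Y and Z are rare
sequenced = {"s1": ("X", "X"), "s2": ("X", "Y"),
             "s3": ("X", "X"), "s4": ("X", "Z")}
imputed = {"s1": ("X", "X"), "s2": ("X", "Y"),
           "s3": ("X", "X"), "s4": ("X", "X")}
rare_concordance = concordance_below_cutoff(imputed, sequenced, cutoff=0.2)
```

In the toy data the rare allele Z is imputed as the common allele X, so concordance among rare alleles falls well below the overall concordance.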
Fig 2. Allele frequency versus concordance rates of HLA alleles by imputation program in European Americans.
Concordance rates were generated using OMNI1 and OMNI5 combined SNP-level data and posterior probability > 0.50 for each imputation program.
Fig 3. Allele frequency versus concordance rates of HLA alleles by imputation program in African Americans.
Concordance rates were generated using OMNI1 and OMNI5 combined SNP-level data and posterior probability > 0.50 for each imputation program.
Concordance rate and call rate for important disease-associated and adverse drug reaction-associated alleles.
| HLA Allele | Disease/ADR | Imputation Program | Concordance Rate (EAs) | Concordance Rate (AAs) |
|---|---|---|---|---|
| | ankylosing spondylitis | SNP2HLA | 0.948 | 0.667 |
| | ankylosing spondylitis | HLA*IMP:02 | 0.936 | 0.250 |
| | ankylosing spondylitis | HIBAG | 0.933 | 1.000 |
| | abacavir HSN | SNP2HLA | 0.996 | 1.000 |
| | abacavir HSN | HLA*IMP:02 | 0.978 | 0.118 |
| | abacavir HSN | HIBAG | 0.975 | 1.000 |
| | allopurinol SJS/TEN | SNP2HLA | 1.000 | 0.857 |
| | allopurinol SJS/TEN | HLA*IMP:02 | 1.000 | 0.621 |
| | allopurinol SJS/TEN | HIBAG | 0.964 | 0.783 |
| | Sjogren’s Syndrome | SNP2HLA | 0.980 | 0.518 |
| | Sjogren’s Syndrome | HLA*IMP:02 | 0.997 | 0.957 |
| | Sjogren’s Syndrome | HIBAG | 0.997 | 0.698 |
| | Lupus erythematosus | SNP2HLA | 0.750 | 1.000 |
| | Lupus erythematosus | HLA*IMP:02 | - | - |
| | Lupus erythematosus | HIBAG | 1.000 | 0.951 |
| | primary biliary cirrhosis | SNP2HLA | 0.978 | 1.000 |
| | primary biliary cirrhosis | HLA*IMP:02 | 0.882 | 0.154 |
| | primary biliary cirrhosis | HIBAG | 0.951 | - |
| | rheumatoid arthritis | SNP2HLA | 0.951 | 0.889 |
| | rheumatoid arthritis | HLA*IMP:02 | 0.856 | 0.179 |
| | rheumatoid arthritis | HIBAG | 0.927 | 0.158 |
Concordance rates were generated using HumanOmni1-QUAD and HumanOmni5-QUAD combined SNP-level data and posterior probability > 0.50 for each imputation program by HLA locus and race/ethnicity. HLA indicates human leukocyte antigen; EA, European American; AA, African American; HSN, hypersensitivity; DILI, drug-induced liver injury; NA, not applicable; SJS, Stevens-Johnson syndrome; TEN, toxic epidermal necrolysis.
1) “-” indicates that the imputation program did not impute the allele.