Venceslas Douillard, Erick C Castelli, Steven J Mack, Jill A Hollenbach, Pierre-Antoine Gourraud, Nicolas Vince, Sophie Limou.
Abstract
The current SARS-CoV-2 pandemic prompted an immediate and broad response from the research community, with studies of both the virus and host genetics. Genetic research investigated HLA association with COVID-19 based on in silico, population, and individual data. However, these studies were conducted with variable scale and success; convincing results were mostly obtained with broader whole-genome association studies. Here, we propose a technical review of HLA analysis, including basic HLA knowledge as well as available tools and advice. We notably describe recent algorithms to infer and call HLA genotypes from GWAS SNPs and NGS data, respectively, which open the possibility of investigating HLA from large datasets without a specific initial focus on this region. We thus hope this overview will empower geneticists who were unfamiliar with HLA to run MHC-focused analyses, following in the footsteps of the Covid-19|HLA & Immunogenetics Consortium.
Keywords: HLA; Major Histocompatibility Complex (MHC); association analysis; immunogenetics; imputation
Year: 2021 PMID: 34925459 PMCID: PMC8677840 DOI: 10.3389/fgene.2021.774916
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. History and development of HLA nomenclature as illustrated by HLA-A*02:01:01:134Q. Each level of resolution corresponds to a group of HLA alleles fitting the description, except for the full DNA sequence, which is a unique HLA allele. Colored pins represent non-synonymous polymorphisms (pink) and synonymous or intronic polymorphisms (purple); the displayed polymorphism is only indicative and does not reflect the HLA-A*02:01:01:134Q sequence. P and G groups are named with the lowest-numbered two-field (HLA-A*02:01) and three-field (HLA-A*02:01:01) HLA allele name, respectively. Class II P and G groups are based on exon 2 only, while class I P and G groups are based on exons 2 and 3. Supertypes are not defined as part of the official nomenclature (Wang and Claesson, 2014; del Guercio et al., 1995; Sidney et al., 1995). Created with biorender.com.
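The field-based naming shown in Figure 1 can be illustrated with a short Python sketch that splits an allele name into its gene, colon-separated fields, and optional expression suffix, and then truncates it to a lower resolution. The function names `parse_allele` and `reduce_resolution` are hypothetical and do not come from any of the tools reviewed here. The sketch also assumes that dropping the suffix when truncating is acceptable, which is a simplification, and it does not compute P or G groups.

```python
import re
from typing import NamedTuple, Optional

# Matches names such as "HLA-A*02:01:01:134Q": gene, one to four
# colon-separated numeric fields, and an optional expression suffix.
ALLELE_RE = re.compile(
    r"^(?:HLA-)?(?P<gene>[A-Z0-9]+)\*"
    r"(?P<fields>\d+(?::\d+){0,3})"
    r"(?P<suffix>[NLSCAQ])?$"
)


class HlaAllele(NamedTuple):
    gene: str
    fields: tuple
    suffix: Optional[str]


def parse_allele(name: str) -> HlaAllele:
    """Split an HLA allele name into gene, fields, and suffix (hypothetical helper)."""
    match = ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised HLA allele name: {name!r}")
    return HlaAllele(
        gene=match["gene"],
        fields=tuple(match["fields"].split(":")),
        suffix=match["suffix"],
    )


def reduce_resolution(name: str, n_fields: int) -> str:
    """Truncate an allele name to its first n_fields fields.

    Assumption: the expression suffix is dropped, which is a simplification.
    """
    if not 1 <= n_fields <= 4:
        raise ValueError("n_fields must be between 1 and 4")
    allele = parse_allele(name)
    return f"HLA-{allele.gene}*{':'.join(allele.fields[:n_fields])}"


if __name__ == "__main__":
    print(parse_allele("HLA-A*02:01:01:134Q"))
    print(reduce_resolution("HLA-A*02:01:01:134Q", 2))
```

Truncating to two fields groups alleles by the first two fields of the name, which is the resolution most association analyses work at; a name with fewer fields than requested is returned unchanged apart from the suffix.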
Tools for HLA analyses.
| HLA application name | Description | URL |
|---|---|---|
| Alphlard-nt | Identification of somatic mutations in HLA molecules from whole-genome and exome data using Bayesian algorithms | — |
| BIGDAWG | Open-source R package for the case-control analysis of highly polymorphic data at the allele, haplotype and amino-acid level | |
| Easy-HLA | Website with HLA allele haplotyping, upgrading and inference from HLA genotypes, and prediction of HLA-C expression | |
| HATK | Open-source | |
| HLA-check | Perl tool evaluating the probability of accurate HLA genotype imputation by comparing it to SNP imputation in the exonic region of HLA | |
| HLA-EMMA | Donor/recipient compatibility assessment based on solvent-accessible amino acids, using intralocus comparisons | |
| HLA | Open-source R pipeline for HLA association studies. Performs SNP quality control steps, stratification, HLA imputation and representation of the results | |
| HLAHapV | A Java-based HLA Haplotype Validator for quality assessments of HLA typing | |
| HLA-NET | Set of tools to manipulate HLA data, infer haplotypes, convert file formats, and provide information about typing | |
| HLApers | Genotyping and quantification of HLA expression from RNA-seq data | |
| HLA-TAPAS | Open-source | |
| MergeReference | SNP2HLA-compatible tool to concatenate multiple reference panels in order to gain accuracy during HLA imputation | |
| pyHLA | Association analysis for HLA alleles in | |
FIGURE 2. HLA imputation from GWAS data. Reference panels are created from individuals with known SNP and HLA data. Depending on the method, an algorithm will deduce the probability of a specific HLA allele in the population given a SNP haplotype. These newly found links are stored for that reference panel and applied to new SNP data to infer HLA genotypes. HLA-A is given as an example with a truncated list of alleles; other MHC genes are imputed using the same method. Different populations are represented in different circles and imply different allele frequencies. Pinpoints represent SNPs and are only indicative. HLA imputation results are highly dependent on the population chosen for the reference panel. Created with biorender.com.
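The principle in Figure 2 can be shown with a deliberately simplified Python sketch: a reference panel of (SNP haplotype, HLA allele) pairs is turned into a table of conditional frequencies, which is then used to pick the most probable allele for a new haplotype. This is only a toy frequency lookup under stated assumptions, not the algorithm of any published imputation tool. The names `build_reference` and `impute` and the panel data are all invented for illustration.

```python
from collections import Counter, defaultdict


def build_reference(panel):
    """Estimate P(HLA allele | SNP haplotype) by counting pairs in a panel.

    panel: iterable of (snp_haplotype, hla_allele) tuples from individuals
    with known SNP and HLA data (toy representation, an assumption).
    """
    counts = defaultdict(Counter)
    for haplotype, allele in panel:
        counts[haplotype][allele] += 1
    return {
        haplotype: {a: n / sum(c.values()) for a, n in c.items()}
        for haplotype, c in counts.items()
    }


def impute(reference, haplotype):
    """Return (most probable allele, probability) for a new SNP haplotype.

    Haplotypes absent from the reference panel cannot be imputed here and
    yield (None, 0.0).
    """
    probabilities = reference.get(haplotype)
    if not probabilities:
        return None, 0.0
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]


if __name__ == "__main__":
    # Invented toy panel: SNP haplotypes as strings, HLA-A alleles as labels.
    toy_panel = [
        ("ACG", "A*02:01"),
        ("ACG", "A*02:01"),
        ("ACG", "A*01:01"),
        ("TCG", "A*01:01"),
    ]
    reference = build_reference(toy_panel)
    print(impute(reference, "ACG"))
    print(impute(reference, "GGG"))
```

Because the lookup table is built entirely from the panel, a haplotype that is common in the study population but missing from the reference cannot be imputed, which mirrors the dependence on reference population noted in the caption.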
FIGURE 3. Association study pipeline for HLA data and surrounding immunogenetic factors. Created with biorender.com.