Kazuyoshi Hosomichi, Takashi Shiina, Atsushi Tajima, Ituro Inoue.
Abstract
In the past decade, the development of next-generation sequencing (NGS) has paved the way for whole-genome analysis in individuals. Research on the human leukocyte antigen (HLA), an extensively studied molecule involved in immunity, has benefitted from NGS technologies. The HLA region, a 3.6-Mb segment of the human genome at 6p21, has been associated with more than 100 different diseases, primarily autoimmune diseases. Recently, the HLA region has received much attention because severe adverse effects of various drugs are associated with particular HLA alleles. Owing to the complex nature of the HLA genes, classical direct sequencing methods cannot comprehensively elucidate the genomic makeup of HLA genes. Thus far, several high-throughput HLA-typing methods using NGS have been developed. In HLA research, NGS facilitates complete HLA sequencing and is expected to improve our understanding of the mechanisms through which HLA genes are modulated, including transcription, regulation of gene expression and epigenetics. Most importantly, NGS may also permit the analysis of HLA-omics. In this review, we summarize the impact of NGS on HLA research, with a focus on the potential for clinical applications.
Year: 2015 PMID: 26311539 PMCID: PMC4660052 DOI: 10.1038/jhg.2015.102
Source DB: PubMed Journal: J Hum Genet ISSN: 1434-5161 Impact factor: 3.172
Figure 1. HLA typing to provide sequencing data for the HLA gene(s) and regions. NGS-derived HLA sequencing data can be analyzed from various points of view. The minimum scope of polymorphism is the genotype of a single SNV, and the maximum scope is the HLA haplotype sequence as a set of alleles from each HLA gene. The phase-determined sequence of an HLA allele can be used as a reference for HLA typing. The resolution of HLA typing is classified into four categories: two-digit for alleles, four-digit for specific HLA proteins, six-digit for specific HLA coding sequences (CDS) and eight-digit for specific HLA genome sequences, including untranslated regions and introns.
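The four resolution categories in the Figure 1 caption correspond to the number of colon-separated fields in a standard HLA allele name (for example, `A*02:01:01:01`). The following is a minimal sketch of that mapping; the function names `parse_allele` and `truncate` and the regular expression are illustrative assumptions, not part of the review.

```python
import re

# Number of colon-separated fields -> resolution label used in the Figure 1 caption.
RESOLUTION = {1: "two-digit", 2: "four-digit", 3: "six-digit", 4: "eight-digit"}

# Hypothetical pattern: optional "HLA-" prefix, gene name, "*", one to four
# numeric fields, and an optional single-letter expression suffix (e.g. "N").
ALLELE_RE = re.compile(r"^(?:HLA-)?([A-Z0-9]+)\*(\d+(?::\d+){0,3})[A-Z]?$")


def parse_allele(name):
    """Split an allele name into (gene, fields, resolution label)."""
    match = ALLELE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognizable HLA allele name: {name!r}")
    gene = match.group(1)
    fields = match.group(2).split(":")
    return gene, fields, RESOLUTION[len(fields)]


def truncate(name, n_fields):
    """Reduce an allele name to at most n_fields fields (a lower resolution)."""
    gene, fields, _ = parse_allele(name)
    return f"{gene}*{':'.join(fields[:n_fields])}"


if __name__ == "__main__":
    print(parse_allele("HLA-A*02:01:01:01"))
    print(truncate("HLA-DRB1*15:01:01", 2))
```

Truncating to two fields gives the protein-level (four-digit) name, which is the level at which results from different typing methods are often compared.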
PCR-based HLA typing using NGS
| Year | First author | Journal | Amplification method | Target region | Sequencer | Analysis software |
|---|---|---|---|---|---|---|
| 2009 | Gabriel C | Human Immunology | PCR for each exon | Exons 1, 2, 3 and 4 | GS FLX (Roche) | AVA (Roche); Assign SBT (Conexio Genomics) |
| 2009 | Bentley G | Tissue Antigens | PCR for each exon | Exons 2, 3 and 4 of A, B and C; exon 2 of DRB1, DPB1, DQA1; exons 2 and 3 of DQB1 | GS FLX (Roche) | HLA typing software (Conexio Genomics) |
| 2010 | Lind C | Human Immunology | Long PCR | Entire gene of A, B and C; exons 2–3 of DRB1 and DQB1 | GS FLX (Roche) | Assign MPS software (Conexio Genomics) |
| 2010 | Lank SM | Human Immunology | RT-PCR | Exons 2, 3 and 4 of A, B and C | GS FLX (Roche) | BLAT |
| 2011 | Erlich | BMC Genomics | PCR | Exons 2, 3 and 4 of A, B and C | GS FLX (Roche) | GATK |
| 2011 | Holcomb CL | Tissue Antigens | PCR for each exon | Exons 2, 3 and 4 of A, B and C; exon 2 of DRB1, DRB3/4/5, DPB1, DQA1; exons 2 and 3 of DQB1 | GS FLX (Roche) | Assign ATF (Conexio Genomics) |
| 2012 | Wang C | Proc Natl Acad Sci U S A | Long PCR | Exons 1–7 of A, B and C; exons 2–5 of DRB1 | GAIIx (Illumina); HiSeq2000 (Illumina); MiSeq (Illumina) | BLASTN |
| 2012 | Shiina T | Tissue Antigens | Long PCR | Entire gene (2 amplicons for DRB1 and DPB1) | GS Junior (Roche); ionPGM (Thermo) | BLAT; Sequencher (GeneCodes) |
| 2012 | Lank SM | BMC Genomics | RT-PCR | Exons 1–7 of HLA class I (two amplicons); exons 1–4 of HLA class II | GS Junior (Roche) | BLAT |
| 2013 | Moonsamy PV | Tissue Antigens | PCR | Exons 2 and 3 of A, B and C; exon 2 of DRB1, DRB3/4/5, DQB1 | GS FLX (Roche) | Assign ATF 454 (Conexio Genomics) |
| 2013 | Ringquist S | PLoS One | PCR | Exon 2 | GS FLX (Roche) | CAPSeq (Original) |
| 2013 | Danzer M | BMC Genomics | PCR for each exon | Exons 2, 3 and 4 of A; exons 1, 2, 3 and 4 of B; exons 1, 2, 3, 4 and 7 of C; exons 2 and 3 of DRB1, DRB3/4/5, DQB1; exon 2 of DPB1 | GS Junior (Roche) | Assign ATF (Conexio Genomics) |
| 2013 | Hosomichi K | BMC Genomics | Long PCR | Entire gene | MiSeq (Illumina) | Phase-defined sequencing (Original) |
| 2013 | Trachtenberg EA | Methods Mol Biol | PCR for each exon | Exons 2, 3 and 4 of A, B and C; exon 2 of DRB1, DRB3/4/5, DPB1, DQA1; exons 2 and 3 of DQB1 | GS FLX (Roche) | Assign ATF (Conexio Genomics) |
| 2014 | Ozaki Y | Tissue Antigens | Long PCR | Exons 2–6 of DRB1, DRB3/4/5 | GS Junior (Roche) | BLAT; Sequencher (GeneCodes) |
| 2014 | Hajeer AH | Tissue Antigens | PCR | Exons 2 and 3 of A, B and C; exon 2 of DRB1; exons 2 and 3 of DQB1 | GS FLX (Roche); GS Junior (Roche) | Assign ATF 454 (Conexio Genomics) |
| 2014 | Hosomichi K | BMC Genomics | Long PCR | Entire gene | MiSeq (Illumina) | Phase-defined sequencing (Original) |
| 2014 | Smith AG | Human Immunology | PCR for each exon | Exons 2 and 3 of DQA1, DQB1, DRB1, DRB3/4/5; exon 2 of DPA1 and DPB1 | MiSeq (Illumina) | GeMS (Scisco Genetics) |
| 2014 | Ehrenberg PK | BMC Genomics | Long PCR | Entire gene | MiSeq (Illumina) | Omixon target (Omixon) |
| 2014 | Zhou M | Tissue Antigens | PCR | Exons 1–7 of A, B and C (4 amplicons); exon 2 of DRB1; exons 2 and 3 of DQB1 | HiSeq2000 (Illumina) | BGI computing procedure (Original) |
| 2015 | Lan JH | Human Immunology | Long PCR | Entire gene (2 amplicons for DRB1) | MiSeq (Illumina) | NGSengine (Gen Dx) |
| 2015 | Ozaki Y | BMC Genomics | Long PCR | Entire genes of A, B and C; exons 2–4 of DRB1, DRB3/4/5, DQA1, DQB1; exons 2–6 of DPB1 | ionPGM (Thermo) | SeaBass (Original, In-house) |
Figure 2. Preparation of HLA gene fragments for the DNA library. DNA fragments of the HLA genes are prepared by PCR-based (a) or hybridization-based (b) methods. (a) Published PCR-based methods use different PCR designs, such as short PCR for target exons (blue bar) or long PCR for entire genes (red bar). After PCR amplification, the pooled PCR products are used for library preparation, with or without fragmentation, to add adapters, with or without indexes, for each sequencer. In the PCR-based method, the first step is PCR for HLA genes and the second step is library preparation. (b) The sequence capture method based on hybridization is also commonly used to enrich HLA gene fragments. DNA/RNA probes carrying HLA gene sequences are hybridized to the DNA library, which includes the HLA gene sequences. DNA library fragments bound to the biotinylated probes are collected using streptavidin magnetic beads and a magnet. In the sequence capture method, the first step is library preparation and the second step is enrichment for HLA genes. (c) After sequencing, the HLA gene sequences of each individual are reconstructed by alignment to reference HLA gene sequences. The consensus sequences constructed from the aligned reads are searched against the IMGT/HLA database to identify specific HLA alleles. In NGS-based HLA typing, the basic data analysis approach is similar for the PCR-based and sequence capture methods.
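Step (c) of Figure 2 (build a consensus from aligned reads, then search it against known alleles) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: reads are given as already-aligned `(offset, sequence)` pairs, the consensus is a simple per-position majority vote, the allele "database" is a small dictionary with made-up names, and heterozygosity (two alleles per gene) is ignored. Real pipelines are considerably more involved.

```python
from collections import Counter


def consensus(aligned_reads, length):
    """Majority-vote consensus over reads given as (start_offset, sequence).

    Positions with no read coverage are reported as 'N'.
    """
    columns = [Counter() for _ in range(length)]
    for start, seq in aligned_reads:
        for i, base in enumerate(seq):
            if 0 <= start + i < length:
                columns[start + i][base] += 1
    return "".join(c.most_common(1)[0][0] if c else "N" for c in columns)


def match_alleles(cons, allele_db):
    """Return the names of alleles identical to the consensus.

    Uncovered ('N') consensus positions are treated as compatible with any base.
    """
    hits = []
    for name, seq in allele_db.items():
        if len(seq) == len(cons) and all(c == "N" or c == s for c, s in zip(cons, seq)):
            hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical two-allele database and three overlapping reads.
    toy_db = {"toy*01": "ACGTACGT", "toy*02": "ACGTTCGT"}
    toy_reads = [(0, "ACGTT"), (3, "TTCGT"), (2, "GTTCG")]
    cons = consensus(toy_reads, 8)
    print(cons, match_alleles(cons, toy_db))
```

When coverage is incomplete, several alleles can remain compatible with the consensus, which is one reason exon-only designs may yield ambiguous calls.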
HLA-typing software and category of acceptable reads
| Software | Acceptable reads | First author |
|---|---|---|
| HLAminer | WGS/WES/RNA-seq/amplicon | Warren RL |
| seq2HLA | RNA-seq | Boegel S |
| ATHLATES | WGS/WES/amplicon | Liu C |
| OptiType | WGS/WES/RNA-seq | Szolek A |
| HLAforest | RNA-seq | Kim HJ |
| PHLAT | WGS/WES/RNA-seq | Bai Y |
| Phase-defined HLA sequencing | Amplicon | Hosomichi K |
| HLAreporter | WGS/WES | Huang Y |
| HLA-VBSeq | WGS | Nariai N |
| HLAssign | Sequence capture | Wittig M |
| Assign ATF (Conexio Genomics) | Amplicon | - |
| Omixon Target HLA (Omixon) | WGS/WES/amplicon | Major E |
| NGSengine (Gen Dx) | Amplicon | - |
| GeMS (Scisco Genetics) | Amplicon | - |
Figure 3. Overview of data analysis for HLA typing using WGS/WES. Massive numbers of sequence reads from WGS/WES are aligned to the whole IMGT/HLA database (all known HLA alleles) to search for the best-matching alleles based on alignment statistics, the number of reads covering exons and the extent of exon coverage. The HLA allele is identified by statistical analysis, retaining only reads that map to any allele in the IMGT/HLA database with a low number of mismatches.
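The ranking idea in Figure 3 (keep only reads that map to some allele with few mismatches, then compare alleles by read count and extent of coverage) can be sketched as follows. This is a toy illustration under stated assumptions: alignment is a brute-force ungapped scan, the allele names and sequences are invented, and the ranking is a plain sort rather than the statistical models used by the published tools.

```python
def best_hit(read, ref, max_mismatches):
    """Best ungapped placement of read on ref as (start, mismatches), or None."""
    best = None
    for start in range(len(ref) - len(read) + 1):
        window = ref[start:start + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


def rank_alleles(reads, allele_db, max_mismatches=0):
    """Rank alleles by (mapped read count, fraction of positions covered)."""
    scores = []
    for name, ref in allele_db.items():
        covered = [False] * len(ref)
        n_reads = 0
        for read in reads:
            hit = best_hit(read, ref, max_mismatches)
            if hit is None:
                continue  # read is not mappable to this allele
            n_reads += 1
            for i in range(hit[0], hit[0] + len(read)):
                covered[i] = True
        scores.append((name, n_reads, sum(covered) / len(ref)))
    # Highest read count first; coverage fraction breaks ties.
    return sorted(scores, key=lambda s: (s[1], s[2]), reverse=True)


if __name__ == "__main__":
    # Hypothetical alleles differing at one position, and three short reads.
    toy_db = {"toy*01": "ACGTACGTAA", "toy*02": "ACGTTCGTAA"}
    toy_reads = ["ACGTT", "GTTCG", "CGTAA"]
    for row in rank_alleles(toy_reads, toy_db):
        print(row)
```

In this toy case the read shared by both alleles cannot discriminate between them; only reads spanning the differing position do, which mirrors why closely related HLA alleles are hard to separate with short reads.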
Figure 4. Example of target resequencing to detect variants and functional prediction of the regulatory region. Target resequencing of the HLA region clarifies all variants in the target region. For example, several variants and an approximately 2-kb deletion have been detected in the upstream region of HLA-DRA. (a) Alignment view of mapped reads (pink: forward-strand read, purple: reverse-strand read) in the alignment track for detection of SNVs (A: green, C: blue, G: yellow, T: red), with the deletion displayed in the coverage track. (b) The region is located in cis-regulatory elements, namely active (H3K27ac-marked) enhancers and a DNase I-hypersensitive site defined by ENCODE chromatin immunoprecipitation sequencing and DNaseI-seq peaks. The deletion and SNVs may affect the expression level of HLA-DRA by influencing the binding of TFs.
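A deletion that appears as a gap in the coverage track, as in panel (a), can be flagged by scanning per-base depth for long runs of low coverage. The sketch below is only an illustration of that idea; the function name, threshold and minimum length are assumptions, and it is not the method used to produce the figure. Dedicated structural-variant callers also use split reads and read-pair information.

```python
def low_coverage_runs(coverage, threshold, min_length):
    """Return half-open (start, end) intervals where depth < threshold
    for at least min_length consecutive positions."""
    runs = []
    start = None
    for pos, depth in enumerate(coverage):
        if depth < threshold:
            if start is None:
                start = pos  # a low-coverage run begins here
        elif start is not None:
            if pos - start >= min_length:
                runs.append((start, pos))
            start = None
    # Close a run that extends to the end of the track.
    if start is not None and len(coverage) - start >= min_length:
        runs.append((start, len(coverage)))
    return runs


if __name__ == "__main__":
    # Hypothetical per-base depths: a four-base gap and one isolated dip.
    depth = [30] * 5 + [0] * 4 + [28] * 3 + [1] + [31] * 2
    print(low_coverage_runs(depth, threshold=5, min_length=3))
```

A homozygous deletion would be expected to show near-zero depth, whereas a heterozygous one would show roughly halved depth, so the threshold has to be chosen relative to the surrounding coverage.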
Number of eQTL SNVs linking expression level of HLA genes
| HLA gene | Number of eQTL SNVs |
|---|---|
| | 821 |
| | 12 |
| | 773 |
| | 288 |
| | 580 |
| | 544 |
| | 473 |
| | 2 |
| | 126 |
| Total | 3619 |
Abbreviations: eQTL, expression Quantitative Trait Loci; HLA, human leukocyte antigen; SNV, single-nucleotide variant.
Lead SNVs linking rheumatoid arthritis association with regulatory information in the human genome
| Chromosome | Position | SNV | Distance from TSS | Location | Score | Supporting data |
|---|---|---|---|---|---|---|
| 6 | 29 789 171 | rs1610677 | 23 582 bp | Intergenic region | No regulatory annotation | |
| 6 | 31 379 931 | rs1063635 | 9903 bp | Coding region | 6 | Motif |
| 6 | 32 218 989 | rs9296015 | 27 283 bp | Intergenic region | 6 | Motif |
| 6 | 32 282 854 | rs6910071 | 49 238 bp | Intron | 6 | Motif |
| 6 | 32 429 643 | rs9268853 | 68 359 bp | Intergenic region | 6 | Motif |
| 6 | 32 574 171 | rs615672 | 35 847 bp | Intergenic region | 6 | Motif |
| 6 | 32 577 380 | rs660895 | 32 638 bp | Intergenic region | 4 | ChIP-seq peak + DNaseI-seq peak |
| 6 | 32 602 269 | rs9272219 | 44 749 bp | Intergenic region | 6 | Motif |
| 6 | 32 663 851 | rs6457617 | 27 409 bp | Intergenic region | 6 | Motif |
| 6 | 32 663 999 | rs6457620 | 27 557 bp | Intergenic region | 6 | Motif |
| 6 | 32 671 103 | rs13192471 | 34 661 bp | Intergenic region | No regulatory annotation | |
| 6 | 32 680 928 | rs7765379 | 44 486 bp | Intergenic region | No regulatory annotation | |
Abbreviations: SNP, single nucleotide polymorphism; SNV, single-nucleotide variant; TSS, transcription start site.