Literature DB >> 28600441

Unique Allelic eQTL Clusters in Human MHC Haplotypes.

Tze Hau Lam1, Meixin Shen1, Matthew Zirui Tay2, Ee Chee Ren3,4.   

Abstract

The control of gene regulation within the major histocompatibility complex (MHC) remains poorly understood, despite several expression quantitative trait loci (eQTL) studies revealing an association of MHC gene expression with independent tag-single nucleotide polymorphisms (SNPs). MHC haplotype variation may exert a greater effect on gene expression phenotype than specific single variants. To explore the effect of MHC haplotype sequence diversity on gene expression phenotypes across the MHC, we examined the MHC transcriptomic landscape at haplotype-specific resolution for three prominent MHC haplotypes (A2-B46-DR9, A33-B58-DR3, and A1-B8-DR3) derived from MHC-homozygous B-lymphoblastoid cell lines (B-LCLs). We demonstrate that MHC-wide gene expression patterns are dictated by underlying haplotypes, and identify 36 differentially expressed genes. By mapping these haplotype sequence variations to known eQTL, we provide evidence that unique allelic combinations of eQTL, embedded within haplotypes, are correlated with the level of expression of 17 genes. Interestingly, the influence of haplotype sequence on gene expression is not homogenous across the MHC. We show that haplotype sequence polymorphisms within or proximate to HLA-A, HLA-C, C4A, and HLA-DRB regions exert haplotype-specific gene regulatory effects, whereas the expression of genes in other parts of the MHC region are not affected by the haplotype sequence. Overall, we demonstrate that MHC haplotype sequence diversity can impact phenotypic outcome via the alteration of transcriptional variability, indicating that a haplotype-based approach is fundamental for the assessment of trait associations in the MHC.
Copyright © 2017 Lam et al.

Entities:  

Keywords:  MHC; MHC sequence polymorphism; eQTLs; haplotypes

Mesh:

Year:  2017        PMID: 28600441      PMCID: PMC5555465          DOI: 10.1534/g3.117.043828

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Genome-wide association studies (GWAS) have identified thousands of SNPs that confer association with >1600 observable phenotypes, complex traits, and diseases (Welter ). As a result of the massive number of possible individual SNPs in the human genome, tag-SNPs, which are representative of a short sequence block or haplotype held together by linkage disequilibrium, are used to define a genomic block (International HapMap Consortium 2003). Notably, the majority (>90%) of these tag-SNPs are found in noncoding regions of the human genome (Maurano ), presenting a major challenge for our understanding of the mechanistic effects of these variants on common traits. eQTL mapping of DNA sequence polymorphisms that modify transcript abundance of gene(s) has since emerged as an effective approach to reveal the implication of trait-associated variants on observable phenotypes (Albert and Kruglyak 2015). This approach has provided compelling evidence for eQTL causality in numerous traits, such as neurological disorders, autoimmune disorders, and cancers (Fairfax ; Li ; Ongen ; Raj ; Ramasamy ). The MHC region is one of the most prominent and intensely studied regions in the entire human genome. The genomic architecture of the MHC region displays high gene density, extreme sequence polymorphism, and high linkage disequilibrium (Horton ). More importantly, GWAS have consistently implicated tag-SNPs located within the MHC region with a range of traits (Trowsdale and Knight 2013). Although several eQTL studies have revealed the control of MHC gene expression by single independent tag-SNPs (Vandiedonck ; de Jong ; Fairfax ), functional variants implicating trait risks within the MHC have not been fully determined. Tag-SNPs are described as being embedded within specific haplotypes, which include functional variants that confer trait susceptibility (Balding 2006; Frazer ). However, the highly polymorphic nature of the MHC coupled with extensive linkage disequilibrium gives rise to vast haplotype diversity of varying length that cannot be fully reflected by tag-SNPs (de Bakker ; Gourraud ). In addition, multiple loci within the MHC can contribute to disease development, and are often in tight linkage disequilibrium with one another (Feng ; Rioux ). As the result of these factors, haplotype variation in the MHC exerts a greater effect on phenotypic outcomes than a specific single variant (Vandiedonck and Knight 2009). Indeed, numerous MHC haplotypes that are found in relatively high incidence in the general population are frequently implicated in various autoimmune disorders (Traherne ). One example is the A1-B8-DR3 MHC haplotype that is associated with type 1 diabetes, systemic lupus erythematosus, IgA deficiency, and celiac disease (Schroeder ; Price ). Yet, when compared with the causality of single variants to traits, our current understanding of how haplotype combinations affect phenotypic consequences is inadequate. A key component for understanding this complex relationship is through the analysis of MHC haplotype sequence variations (de Bakker and Raychaudhuri 2012; Larsen ). As such, the ability to demonstrate how MHC haplotype sequence variability can modify gene transcript abundance within the MHC will be an important step toward improving our understanding of the biological effects of specific MHC haplotypes. To understand the effect of MHC haplotype sequence diversity on gene expression phenotype, we conducted an integrated analysis using three genetic approaches: (a) characterization of MHC transcriptomics derived from RNA sequencing (RNA-Seq) of MHC-homozygous B-LCLs, (b) analysis of multiple MHC haplotype sequences and their polymorphisms at nucleotide resolution, and (c) mapping of known eQTL to these haplotype-specific polymorphisms. This approach allowed us to differentiate eQTL that are fixed within haplotypes, and enabled accurate mapping of potential regions and variants that impact gene regulation within the MHC. Through this integrated analysis, we demonstrate a correlation between eQTL and MHC haplotype sequence variation, and identify haplotype-specific eQTL combinations that exert control over gene expression within the MHC region.

Materials and Methods

Cell lines selection

Eight B-LCLs that are homozygous at the HLA-A, -B, -C, and DRB1 loci, were selected for RNA-Seq. Another 31 B-LCLs with at least one copy of their chromosomes including either the A2-B46-DR9 (n = 15) or A33-B58-DR3 (n = 16) haplotype were used to validate expression data from RNA-Seq. All cells were prepared and provided by the Singapore Immunology Network, A*STAR, Singapore and the Research Cell Bank and Fred Hutchinson Research Centre, Seattle, WA. HLA loci were typed using a sequence-based approach as described previously (Lam ). B-LCL cultures were stimulated for 6 hr at 37° with 200 nM of phorbol 12-myristate 13-acetate (Sigma) and 125 nM of ionomycin to activate the cells.

Quantitative real-time PCR (qPCR)

RNA extraction was performed using an RNeasy Mini kit (QIAGEN) according to the manufacturer’s instructions, and RNA was quantitated using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific). Samples (100 ng/μl) were transcribed into cDNA using the AccuScript High Fidelity 1st Strand cDNA Synthesis Kit (Agilent). Gene-specific qPCR was performed for triplicate samples on the Applied Biosystems 7500 Real-Time PCR System using the KAPA SYBR FAST qPCR Master Mix (KAPA Biosystems). Gene-specific primers were designed using NCBI Primer-Blast. Primer sequences are available in Supplemental Material, Table S1. Relative quantities of RNA were measured using ΔΔCT normalized to HPRT levels. Differential gene expression between haplotypes was assessed using the Mann–Whitney U test.

RNA-Seq

The quality and concentration of total RNA, extracted using the RNeasy Mini Kit (QIAGEN), was measured using an Agilent Bioanalyzer 2100. Only samples with an RNA integrity number of >7.0 were selected for RNA-Seq. Prior to library preparation, ribosomal RNA was removed from samples using the human Ribo-Zero rRNA Removal Kit (Epicentre). Paired-end RNA-Seq libraries were generated using the Illumina TruSeq RNA library preparation kit v2 (Illumina, San Diego, CA). A total of 16 libraries were prepared (two libraries per cell line) and sequenced on the Illumina HiSeq 2000 machine at two libraries per lane.

Processing of RNA-Seq reads and expression analysis

Raw sequencing reads with 70% of base positions having a Phred score of <20 were removed prior to alignment using the NGS QC Toolkit (Patel and Jain 2012). The processed reads were then mapped to the hg19 human genome assembly using TopHat2 v2.0.13 (Trapnell ). Reads mapping to genomic features in the GENCODE release 19 (GRCh37.p13) gene annotation were summarized using the featureCounts program in the Subread package v1.5.0-p1 (Liao ). For a read pair to be assigned to a genomic feature, it has to be uniquely mapped to a single genomic location, both ends must be concordantly mapped, and it should have a mapping quality score ≥ 25. The DESeq2 v1.10.1 (Love ) was used to normalize the RNA-Seq libraries, and limma v3.26.8 (Ritchie ) was used to perform differential expression evaluation. Only loci with a sum of normalized read counts across all samples >10 were considered as expressed, and genomic regions mapping to multiple genomic features were not considered for differential expression analysis. The Benjamini and Hochberg method was used to control for multiple testing.

MHC haplotype sequence and data analysis

The Chinese A2-B46-DR9 and A33-B58-DR3 haplotype sequences were derived from Lam , and the BED file containing sequencing information for the A1-B8-DR3 haplotype was downloaded from https://www.ucl.ac.uk/cancer/research/department-cancer-biology/medical-genomics-group/past-projects/mhc-haplotype-project. The coordinates of the haplotype sequence were based on the Human Reference Sequence Assembly 37.2 (NCBI build 37.2). Haplotype sequence comparisons were performed using in-house written R-scripts (Lam ). Cis-acting eQTL SNPs specific to transformed B-LCLs isolated from healthy individuals were obtained from the Genotype-Tissue Expression (GTEx) consortium (Genotype-Tissue Expression Project 2013) database (Release V6). Only SNPs with an adjusted P-value < 0.05 were considered for analysis. Regularized log transformation, implemented by DESeq2, was applied to normalized raw counts, and was then used as input for hierarchical clustering and principal component analysis (PCA). Hierarchical clustering, PCA, k-means clustering, and calculation of Cohen’s κ coefficient were all implemented using packages in R.

Data availability

The gene expression data are available at GEO using the accession number GSE83403.

Results

MHC-wide gene expression patterns

The correlation of MHC-wide gene expression profiles with MHC haplotypes was examined by transcriptome sequencing of eight EBV-transformed B-LCLs that are HLA homozygous at the HLA-A, -B, -C, and DRB1 loci. Of the eight cell lines tested, three (B58AL, B58SC, and B58CF) carry the Chinese A*33:03-C*03:02-B*58:01-DRB1*03:01 (A33-B58-DR3) MHC haplotype; three (B46BM, B46ZS, and B46CM) carry the Chinese A*02:07-C*01:02-B*46:01-DRB1*09:01 (A2-B46-DR9) MHC haplotype; and two (ARC and COX) carry the European A*01:01-C*07:01-B*08:01-DRB1*03:01 (A1-B8-DR3) MHC haplotype. The sequences of these three MHC haplotypes were previously established and are found to exhibit features of conserved extended haplotypes (CEHs), where long stretches of dominant sequence across the entire MHC genomic region remain intact (Stewart ; Aly ; Lam ). The use of HLA homozygous cell lines of MHC haplotypes that include CEHs ensures that both copies of the chromosomes at the MHC region are representative of the haplotype of interest, and should provide an accurate depiction of the MHC region’s transcriptome profile of a particular MHC haplotype. We analyzed the expression of genes residing in the MHC genomic region from chr6:29.0–33.2 Mb. Under the GENCODE (GRCh37.p13) gene annotation, there are a total of 338 loci (inclusive of protein-coding, pseudogene, and noncoding RNAs) found within the MHC region, and 177 of these were found to be expressed in the EBV-transformed B-LCLs. To examine the transcriptome profiles of the genes expressed within the MHC region, regularized log values of the 177 loci were fitted for hierarchical clustering and PCA. From the analysis plots, we observed that samples carrying identical MHC haplotypes were distinctly clustered together (Figure 1A). In contrast, when a similar analysis was performed using the whole-genome transcriptomic data, no detectable cluster was observed (Figure 1B). We then performed a more refined comparison by analyzing the gene expression pattern within chromosome 1 as well as chromosome 6 (excluding the genes in the MHC region) as controls and again no detectable cluster was seen (Figure S1 in File S1). These results indicate that the clustering pattern derived from gene expression is exclusive to the MHC region, suggesting that MHC-wide gene expression patterns are associated with the underlying MHC haplotype carried by each individual.
Figure 1

MHC-wide gene expression patterns defined by MHC haplotypes. (A) Hierarchical and PC analysis based on expression of 177 MHC-residing loci found within chr6:29.0–33.2 Mb. (B) Hierarchical and PC analysis based on expression of genes in the human genome. Expression profiles were derived from RNA-sequencing of eight B-LCLs with two libraries generated per B-LCL. Blue indicates B-LCLs carrying the A2-B46-DR9 haplotype, red indicates B-LCLs carrying the A33-B58-DR3 haplotype, and green indicates B-LCLs carrying the A1-B8-DR3 haplotype. B-LCL, B-lymphoblastoid cell line; chr, chromosome; MHC, major histocompatibility complex; PC, principal component.

MHC-wide gene expression patterns defined by MHC haplotypes. (A) Hierarchical and PC analysis based on expression of 177 MHC-residing loci found within chr6:29.0–33.2 Mb. (B) Hierarchical and PC analysis based on expression of genes in the human genome. Expression profiles were derived from RNA-sequencing of eight B-LCLs with two libraries generated per B-LCL. Blue indicates B-LCLs carrying the A2-B46-DR9 haplotype, red indicates B-LCLs carrying the A33-B58-DR3 haplotype, and green indicates B-LCLs carrying the A1-B8-DR3 haplotype. B-LCL, B-lymphoblastoid cell line; chr, chromosome; MHC, major histocompatibility complex; PC, principal component.

Genes differentially expressed in the MHC region

Using the MHC haplotype transcriptome profiles generated from RNA-Seq, we proceeded to investigate MHC haplotype-specific gene expression. We identified 36 significantly differentially expressed genes across the three MHC haplotypes (adjusted P-value < 0.05) (Table S2); 27 genes between A2-B46-DR9 and A33-B58-DR3; 20 genes between A2-B46-DR9 and A1-B8-DR3; as well as 16 genes between A33-B58-DR3 and A1-B8-DR3 (Figure 2A). Of these genes, only the HLA-DRA gene was found to be differentially expressed in all three pairwise haplotype comparisons. From the examination of the top differentially expressed genes (adjusted P-value < 0.001), we observed that 9 out of 14 were classical HLA class I and class II genes (Figure 2B), suggesting that haplotype sequence variations have a greater impact on the level of the HLA gene expression than other loci residing within the MHC region. To validate the RNA-Seq results, we selected an independent cohort of B-LCLs where at least one copy of their chromosome has either the A2-B46-DR9 or A33-B58-DR3 haplotype, and quantified the level of expression of C6orf48, HLA-DQB2, and MICA in these B-LCLs using real-time PCR. The results obtained correlated with the trend observed by RNA-Seq, providing further evidence for MHC haplotype-specific transcriptomic signatures (Figure 2C).
Figure 2

Differential gene expression analysis between haplotypes. (A) Venn diagram indicating the number of significant differentially expressed genes (adjusted P-value < 0.05) between pairwise haplotype comparisons. (B) Top 15 differentially expressed genes with adjusted P-value < 0.001 in at least one of the pairwise haplotype comparisons. (C) Relative expression of C6orf48, HLA-DQB2, and MICA between the A2-B46-DR9 (n = 15) and A33-B58-DR3 haplotypes (n = 16) in B-LCLs (Mann–Whitney U test). * P-value < 0.05, *** P-value < 0.0001. HPRT, hypoxanthine phosphoribosyltransferase.

Differential gene expression analysis between haplotypes. (A) Venn diagram indicating the number of significant differentially expressed genes (adjusted P-value < 0.05) between pairwise haplotype comparisons. (B) Top 15 differentially expressed genes with adjusted P-value < 0.001 in at least one of the pairwise haplotype comparisons. (C) Relative expression of C6orf48, HLA-DQB2, and MICA between the A2-B46-DR9 (n = 15) and A33-B58-DR3 haplotypes (n = 16) in B-LCLs (Mann–Whitney U test). * P-value < 0.05, *** P-value < 0.0001. HPRT, hypoxanthine phosphoribosyltransferase.

MHC haplotype sequence variation controls gene expression

The presence of haplotype-specific differentially expressed genes observed in our study suggests an effect of haplotype sequence polymorphisms on gene regulation within the MHC region. To further examine this possible relationship, we first acquired cis-acting eQTL SNPs specific to EBV-transformed B-LCLs from the GTEx project (Carithers and Moore 2015), which are retained eQTL SNPs with a false discovery value < 0.05 and are reported to be associated with haplotype-specific differentially expressed genes. Of the 36 haplotype-specific differentially expressed genes identified above, 17 genes have reported eQTL SNPs that control their expression. Next, we obtained the haplotype sequence (chr6:29.0–33.2 Mb) representative of A2-B46-DR9, A33-B58-DR3, and A1-B8-DR3 established in published studies (Stewart ; Traherne ; Lam ), and performed pairwise haplotype sequence comparisons. We then mapped the haplotype sequence variations to the eQTL SNPs that influenced the expression of the 17 identified differentially expressed genes (Table S3), and we assessed these eQTL with the corresponding gene expression of the specific haplotype. We identified 481 haplotype sequence variations mapping to eQTL SNPs where the alternate alleles are associated with an upregulation of HLA-DQA2 gene expression. In the A2-B46-DR9 haplotype, 409/481 of these eQTL SNPs possess the alternate allele, while in the A33-B58-DR3 and A1-B8-DR3 haplotypes, 155/481 and 24/481 eQTL SNPs possess the alternate allele, respectively (Figure 3A). When we examined HLA-DQA2 expression, we found that the A2-B46-DR9 haplotype, which exhibits the highest proportion of alternate allele eQTL SNPs, also has significantly higher gene expression compared with the other haplotypes. We identified 204 haplotype sequence variations mapping to eQTL SNPs where the reference alleles are associated with an upregulation of HLA-DPA1 expression (Figure 3B). In this case, the A33-B58-DR3 and A1-B8-DR3 haplotypes exhibit the reference allele for 100% (n = 204) of the eQTL SNPs compared to 0% in the A2-B46-DR9 haplotype. Subsequently, we determined that HLA-DPA1 expression is significantly higher in both the A33-B58-DR3 and A1-B8-DR3 haplotypes than in the A2-B46-DR9 haplotype (Figure 2B).
Figure 3

Allelic proportions of the haplotype sequence variations mapping to eQTL SNPs and the expression level of the target gene for (A) HLA-DQA2 and (B) HLA-DPA1. In each panel, the top left side shows the number of eQTL SNPs with reference allele (blue) and alternate allele (pink) for each haplotype while the bottom left side shows the frequency of eQTL at each 5 kb window relative to the location of their target gene. The right side of each panel presents the gene expression level for each haplotype derived from eight MHC-homozygous B-LCLs. ** Adjusted P-value < 0.05. B-LCL, B-lymphoblastoid cell line; eQTL, expression quantitative trait loci; MHC, major histocompatibility complex; n.s, not significant; SNP, single nucleotide polymorphism.

Allelic proportions of the haplotype sequence variations mapping to eQTL SNPs and the expression level of the target gene for (A) HLA-DQA2 and (B) HLA-DPA1. In each panel, the top left side shows the number of eQTL SNPs with reference allele (blue) and alternate allele (pink) for each haplotype while the bottom left side shows the frequency of eQTL at each 5 kb window relative to the location of their target gene. The right side of each panel presents the gene expression level for each haplotype derived from eight MHC-homozygous B-LCLs. ** Adjusted P-value < 0.05. B-LCL, B-lymphoblastoid cell line; eQTL, expression quantitative trait loci; MHC, major histocompatibility complex; n.s, not significant; SNP, single nucleotide polymorphism. To identify if haplotype sequence variations at eQTL SNPs corresponded to gene expression for the 17 differentially regulated genes identified, we rescaled the expression of the 17 genes for evaluation among the three haplotypes and determined the allelic proportion of the haplotype sequence variations mapping to eQTL SNPs (Figure S2 in File S1). All 17 genes with reported eQTL information showed significant correlations between their expression and the allelic proportion of their haplotype sequence variations, with 13 correlating positively and 4 correlating inversely (Figure 4). Thus, we provide evidence of haplotype sequence variations that function as cis-acting regulatory variants, controlling expression levels of multiple genes across the MHC region.
Figure 4

Correlation of allelic proportion of eQTL to expression levels of genes that are differentially expressed between haplotypes. The expression of genes was rescaled relatively among the haplotypes. The size of the spheres represents the relative number of eQTL with reference allele (blue) or alternate allele (pink) for each haplotype. The (+)/(−) notations below each gene name indicates the direction of the correlation for each corresponding gene. The exact count of the allelic proportion of the eQTL for each gene is shown in Figure S2 in File S1. eQTL, expression quantitative trait loci; Max, maximum; Min, minimum.

Correlation of allelic proportion of eQTL to expression levels of genes that are differentially expressed between haplotypes. The expression of genes was rescaled relatively among the haplotypes. The size of the spheres represents the relative number of eQTL with reference allele (blue) or alternate allele (pink) for each haplotype. The (+)/(−) notations below each gene name indicates the direction of the correlation for each corresponding gene. The exact count of the allelic proportion of the eQTL for each gene is shown in Figure S2 in File S1. eQTL, expression quantitative trait loci; Max, maximum; Min, minimum.

eQTL allelic combinations embedded in long-range MHC haplotypes

Importantly, our study finds that multiple regulatory sites are fixed in specific allelic combinations and are embedded in the MHC haplotype to direct gene expression. On average, these haplotype-specific allelic combinations consist of 297 (median) eQTL sites where the majority are located within 200 kb of their target genes (Figure 5A). Genes such as C4A, HLA-DRB5, and HLA-DQA2 had >400 eQTL in linkage disequilibrium, extending to >350 kb. In total, 3857 eQTL were mapped to haplotype sequence variations that regulate the 17 identified differentially expressed genes. Interestingly, 805 of these sites were found to affect expression of >1 gene locus (Figure 5B), where 97.8% (787/805) of these sites were clustered to regions proximate to the HLA-C and HLA-DR regions; sites that regulated >2 genes were found exclusively in the HLA-DRB region (Figure S3 in File S1). Detailed analysis of the eQTL allelic sequence revealed haplotype-specific allelic combinations consisting of 10 eQTL covering 87 kb, which are associated with bidirectional expression levels of HLA-DRB1 and HLA-DQA2. The alternate allelic sequence of these 10 eQTL, carried by the A2-B46-DR9 haplotype, was associated with downregulation of HLA-DRB1 and upregulation of HLA-DQA2 expression, while the reference allelic sequence, embedded in the A33-B58-DR3 and A1-B8-DR3 haplotypes, displayed a reverse relationship (Figure 5C). Overall, the occurrence of multiple eQTL existing in specific allelic combinations embedded within an MHC haplotype suggests that these eQTL could cooperatively influence the expression of nearby genes.
Figure 5

Haplotype-specific eQTL allelic combinations. (A) Percentage of eQTL in each 10 kb bin from the gene transcription start site. Red line denotes the average percentage for the 17 differentially expressed genes. Gray lines denote the upper and lower bounds of the 95% C.I. (B) Bar charts illustrating the quantity of target genes regulated by eQTL. (C) Bidirectional expression level of HLA-DRB1 and HLA-DQA2 regulated by haplotype-specific eQTL allelic combinations. The A2-B46-DR9 haplotype holds the alternate allelic sequence (pink), and the A33-B58-DR3 and A1-B8-DR3 haplotypes hold the reference allelic sequence (blue) at the 10 eQTL sites. Chr, chromosome; eQTL, expression quantitative trait loci.

Haplotype-specific eQTL allelic combinations. (A) Percentage of eQTL in each 10 kb bin from the gene transcription start site. Red line denotes the average percentage for the 17 differentially expressed genes. Gray lines denote the upper and lower bounds of the 95% C.I. (B) Bar charts illustrating the quantity of target genes regulated by eQTL. (C) Bidirectional expression level of HLA-DRB1 and HLA-DQA2 regulated by haplotype-specific eQTL allelic combinations. The A2-B46-DR9 haplotype holds the alternate allelic sequence (pink), and the A33-B58-DR3 and A1-B8-DR3 haplotypes hold the reference allelic sequence (blue) at the 10 eQTL sites. Chr, chromosome; eQTL, expression quantitative trait loci.

Localized regions dictate MHC-wide expression patterns

To identify the major contributory regions within the MHC toward the observed haplotype-specific gene expression pattern, we binned chr6:29.0–33.2 Mb into windows of 200 kb and performed a cluster analysis based on the expression of genes located in each of the 200 kb windows. We then assessed the agreement of the 200 kb regional clustering result with the MHC-wide clustering pattern from the pairwise haplotypes comparison using Cohen’s κ coefficient metrics. We observed that gene expression clustering in four of the localized regions (29.8–30.0, 31.2–31.4, 31.8–32.2, and 32.4–33.0 Mb) corresponded with the MHC-wide cluster pattern (Figure 6A) and, with the exception of the 31.8–32.2 Mb region, three of these encompass HLA genes. In addition, the enrichment of haplotype-specific eQTL either within or proximate to these regions (Figure 6B) suggests that haplotype sequence variation in these regions is a possible contributory determinant in defining gene expression within the MHC region. Notably, the gene expression pattern in the HLA-DRB region for the A33-B58-DR3 and A1-B8-DR3 haplotypes cannot be differentiated by MHC haplotype. These haplotypes both carry the HLA-DRB1*03:01 allele and share almost identical sequence at the HLA-DRB region (Lam ). This corresponds with our observation that sequence similarity results in similar levels of gene expression between the two haplotypes. Together, this data highlights the importance of MHC haplotype variation on gene expression.
Figure 6

Regions that define MHC-wide expression patterns. (A) Pairwise haplotype comparisons of the agreement of 200kb window cluster patterns with MHC-wide cluster patterns. (B) Regions within the MHC region enriched with eQTL that dictate the transcription variability of the haplotype-specific differentially expressed genes (red). Gray bars indicate gene density in each 200-kb window. Chr, chromosome; eQTL, expression quantitative trait loci; MHC, major histocompatibility complex.

Regions that define MHC-wide expression patterns. (A) Pairwise haplotype comparisons of the agreement of 200kb window cluster patterns with MHC-wide cluster patterns. (B) Regions within the MHC region enriched with eQTL that dictate the transcription variability of the haplotype-specific differentially expressed genes (red). Gray bars indicate gene density in each 200-kb window. Chr, chromosome; eQTL, expression quantitative trait loci; MHC, major histocompatibility complex.

Discussion

We examined the MHC transcriptomic landscape at haplotype-specific resolution through RNA-Seq of MHC-homozygous B-LCLs that represent three prominent MHC haplotypes (A2-B46-DR9, A33-B58-DR3, and A1-B8-DR3), and showed that MHC-wide gene expression patterns were distinguished by the underlying MHC haplotype. We were able to circumvent phasing ambiguity concerns that arise from the study of diploid genomes through the use of cell lines that are MHC-homozygous carrying MHC haplotypes that display CEH properties. We identified 36 genes that were differentially expressed in the context of MHC haplotype, and further distinguished haplotype sequence variants that can function as regulatory variants and exert allele-specific gene expression control on these differentially expressed haplotype-specific genes. Via mapping of haplotype sequence variations to B-LCL-specific eQTL that are reported to regulate these differentially expressed genes, we identified correlations between unique allelic combinations of eQTL and expression levels of their target genes. Importantly, these allelic combinations of eQTL are structured in specific MHC haplotype sequences. A recent study has shown multiple cis-acting SNPs, presented as a haplotype, that influence gene expression variability in other parts of the human genome; these haplotypes are comprised of up to four SNP sites at most (Garnier ). Our study expanded the notion of a haplotypic effect on gene expression within a critically important region of the human genome, which is remarkable for its association with many complex traits and diseases. We demonstrate haplotype-specific allelic combinations within the MHC region, consisting of between 12 and 546 eQTL sites, where these eQTL can be in linkage disequilibrium at distances of >350 kb. Interestingly, we also found allelic combinations of eQTL that not only direct the expression level of >1 gene but also result in divergent expression of the target genes. An important aspect of this work is the intricate relationship of MHC haplotype sequence variations with eQTL and their subsequent effects on gene expression. We further describe hotspots within the MHC where eQTL are enriched and showed that haplotype-specific eQTL combinations influence the expression of genes in the HLA-A (29.8–30.0 Mb), HLA-C (31.2–31.4 Mb), C4A (31.8–32.2 Mb), and HLA-DRB (32.4–33.0 Mb) regions. This suggests that haplotype sequence variations within or proximate to these regions play an important role in exerting haplotype-specific gene regulatory effects, whereas the expression of genes in other parts of the MHC region are not as tightly controlled by MHC haplotype sequences. One could argue that such observations are due to the high gene density occurring in these hotspot regions; however, we also observed intervals within the MHC region (e.g., 31.4–31.8 Mb; Figure 6B) that are similarly gene-dense, but their expression is not associated with haplotype sequence variation. By examining the MHC transcriptomic landscape at the haplotype level, we observed that eQTL SNP alleles are segregated by haplotype. There are at least two possible models that accommodate these findings. The first is that the eQTL SNP alleles represent multiple cis-acting loci that independently influence gene expression, and that haplotypes combine multiple contributive genes resulting in greater gene expression variation across haplotypes. The second is that the high degree of sequence conservation within particular CEHs and their high frequency in the population skew the results of eQTL studies such that SNPs across a large DNA sequence remain associated with gene expression. Further investigation of the relationship between haplotype and eQTLs are required to inform human genetics approaches within the MHC. Our data demonstrates that MHC haplotype sequence diversity can impact transcription variability in the MHC region. Our integrated analysis, which relates haplotype sequence variations with eQTL SNPs, has enabled us to differentiate potential functional variants that have a direct effect on gene regulation from sequences that are merely in linkage disequilibrium. Through the analysis of MHC haplotype sequences, our study provides an approach for elucidating the association of MHC haplotypes with different phenotypes and presents a way forward for assessing disease risk related to MHC gene expression.

Supplementary Material

Supplemental material is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.117.043828/-/DC1. Click here for additional data file. Click here for additional data file. Click here for additional data file. Click here for additional data file.
  37 in total

1.  The Genotype-Tissue Expression (GTEx) Project.

Authors:  Latarsha J Carithers; Helen M Moore
Journal:  Biopreserv Biobank       Date:  2015-10       Impact factor: 2.300

2.  Multi-SNP analysis of MHC region: remarkable conservation of HLA-A1-B8-DR3 haplotype.

Authors:  Theresa A Aly; Elise Eller; Akane Ide; Katherine Gowan; Sunanda R Babu; Henry A Erlich; Marian J Rewers; George S Eisenbarth; Pamela R Fain
Journal:  Diabetes       Date:  2006-05       Impact factor: 9.461

3.  Susceptibility locus for IgA deficiency and common variable immunodeficiency in the HLA-DR3, -B8, -A1 haplotypes.

Authors:  H W Schroeder; Z B Zhu; R E March; R D Campbell; S M Berney; S A Nedospasov; R L Turetskaya; T P Atkinson; R C Go; M D Cooper; J E Volanakis
Journal:  Mol Med       Date:  1998-02       Impact factor: 6.354

Review 4.  The role of regulatory variation in complex traits and disease.

Authors:  Frank W Albert; Leonid Kruglyak
Journal:  Nat Rev Genet       Date:  2015-02-24       Impact factor: 53.242

Review 5.  Interrogating the major histocompatibility complex with high-throughput genomics.

Authors:  Paul I W de Bakker; Soumya Raychaudhuri
Journal:  Hum Mol Genet       Date:  2012-09-12       Impact factor: 6.150

6.  A second generation human haplotype map of over 3.1 million SNPs.

Authors:  Kelly A Frazer; Dennis G Ballinger; David R Cox; David A Hinds; Laura L Stuve; Richard A Gibbs; John W Belmont; Andrew Boudreau; Paul Hardenbol; Suzanne M Leal; Shiran Pasternak; David A Wheeler; Thomas D Willis; Fuli Yu; Huanming Yang; Changqing Zeng; Yang Gao; Haoran Hu; Weitao Hu; Chaohua Li; Wei Lin; Siqi Liu; Hao Pan; Xiaoli Tang; Jian Wang; Wei Wang; Jun Yu; Bo Zhang; Qingrun Zhang; Hongbin Zhao; Hui Zhao; Jun Zhou; Stacey B Gabriel; Rachel Barry; Brendan Blumenstiel; Amy Camargo; Matthew Defelice; Maura Faggart; Mary Goyette; Supriya Gupta; Jamie Moore; Huy Nguyen; Robert C Onofrio; Melissa Parkin; Jessica Roy; Erich Stahl; Ellen Winchester; Liuda Ziaugra; David Altshuler; Yan Shen; Zhijian Yao; Wei Huang; Xun Chu; Yungang He; Li Jin; Yangfan Liu; Yayun Shen; Weiwei Sun; Haifeng Wang; Yi Wang; Ying Wang; Xiaoyan Xiong; Liang Xu; Mary M Y Waye; Stephen K W Tsui; Hong Xue; J Tze-Fei Wong; Luana M Galver; Jian-Bing Fan; Kevin Gunderson; Sarah S Murray; Arnold R Oliphant; Mark S Chee; Alexandre Montpetit; Fanny Chagnon; Vincent Ferretti; Martin Leboeuf; Jean-François Olivier; Michael S Phillips; Stéphanie Roumy; Clémentine Sallée; Andrei Verner; Thomas J Hudson; Pui-Yan Kwok; Dongmei Cai; Daniel C Koboldt; Raymond D Miller; Ludmila Pawlikowska; Patricia Taillon-Miller; Ming Xiao; Lap-Chee Tsui; William Mak; You Qiang Song; Paul K H Tam; Yusuke Nakamura; Takahisa Kawaguchi; Takuya Kitamoto; Takashi Morizono; Atsushi Nagashima; Yozo Ohnishi; Akihiro Sekine; Toshihiro Tanaka; Tatsuhiko Tsunoda; Panos Deloukas; Christine P Bird; Marcos Delgado; Emmanouil T Dermitzakis; Rhian Gwilliam; Sarah Hunt; Jonathan Morrison; Don Powell; Barbara E Stranger; Pamela Whittaker; David R Bentley; Mark J Daly; Paul I W de Bakker; Jeff Barrett; Yves R Chretien; Julian Maller; Steve McCarroll; Nick Patterson; Itsik Pe'er; Alkes Price; Shaun Purcell; Daniel J Richter; Pardis Sabeti; Richa Saxena; Stephen F Schaffner; Pak C Sham; Patrick Varilly; David Altshuler; Lincoln D Stein; Lalitha Krishnan; Albert Vernon Smith; Marcela K Tello-Ruiz; Gudmundur A Thorisson; Aravinda Chakravarti; Peter E Chen; David J Cutler; Carl S Kashuk; Shin Lin; Gonçalo R Abecasis; Weihua Guan; Yun Li; Heather M Munro; Zhaohui Steve Qin; Daryl J Thomas; Gilean McVean; Adam Auton; Leonardo Bottolo; Niall Cardin; Susana Eyheramendy; Colin Freeman; Jonathan Marchini; Simon Myers; Chris Spencer; Matthew Stephens; Peter Donnelly; Lon R Cardon; Geraldine Clarke; David M Evans; Andrew P Morris; Bruce S Weir; Tatsuhiko Tsunoda; James C Mullikin; Stephen T Sherry; Michael Feolo; Andrew Skol; Houcan Zhang; Changqing Zeng; Hui Zhao; Ichiro Matsuda; Yoshimitsu Fukushima; Darryl R Macer; Eiko Suda; Charles N Rotimi; Clement A Adebamowo; Ike Ajayi; Toyin Aniagwu; Patricia A Marshall; Chibuzor Nkwodimmah; Charmaine D M Royal; Mark F Leppert; Missy Dixon; Andy Peiffer; Renzong Qiu; Alastair Kent; Kazuto Kato; Norio Niikawa; Isaac F Adewole; Bartha M Knoppers; Morris W Foster; Ellen Wright Clayton; Jessica Watkin; Richard A Gibbs; John W Belmont; Donna Muzny; Lynne Nazareth; Erica Sodergren; George M Weinstock; David A Wheeler; Imtaz Yakub; Stacey B Gabriel; Robert C Onofrio; Daniel J Richter; Liuda Ziaugra; Bruce W Birren; Mark J Daly; David Altshuler; Richard K Wilson; Lucinda L Fulton; Jane Rogers; John Burton; Nigel P Carter; Christopher M Clee; Mark Griffiths; Matthew C Jones; Kirsten McLay; Robert W Plumb; Mark T Ross; Sarah K Sims; David L Willey; Zhu Chen; Hua Han; Le Kang; Martin Godbout; John C Wallenburg; Paul L'Archevêque; Guy Bellemare; Koji Saeki; Hongguang Wang; Daochang An; Hongbo Fu; Qing Li; Zhen Wang; Renwu Wang; Arthur L Holden; Lisa D Brooks; Jean E McEwen; Mark S Guyer; Vivian Ota Wang; Jane L Peterson; Michael Shi; Jack Spiegel; Lawrence M Sung; Lynn F Zacharia; Francis S Collins; Karen Kennedy; Ruth Jamieson; John Stewart
Journal:  Nature       Date:  2007-10-18       Impact factor: 49.962

7.  Putative cis-regulatory drivers in colorectal cancer.

Authors:  Halit Ongen; Claus L Andersen; Jesper B Bramsen; Bodil Oster; Mads H Rasmussen; Pedro G Ferreira; Juan Sandoval; Enrique Vidal; Nicola Whiffin; Alexandra Planchon; Ismael Padioleau; Deborah Bielser; Luciana Romano; Ian Tomlinson; Richard S Houlston; Manel Esteller; Torben F Orntoft; Emmanouil T Dermitzakis
Journal:  Nature       Date:  2014-07-23       Impact factor: 49.962

8.  Complete MHC haplotype sequencing for common disease gene mapping.

Authors:  C Andrew Stewart; Roger Horton; Richard J N Allcock; Jennifer L Ashurst; Alexey M Atrazhev; Penny Coggill; Ian Dunham; Simon Forbes; Karen Halls; Joanna M M Howson; Sean J Humphray; Sarah Hunt; Andrew J Mungall; Kazutoyo Osoegawa; Sophie Palmer; Anne N Roberts; Jane Rogers; Sarah Sims; Yu Wang; Laurens G Wilming; John F Elliott; Pieter J de Jong; Stephen Sawcer; John A Todd; John Trowsdale; Stephan Beck
Journal:  Genome Res       Date:  2004-05-12       Impact factor: 9.043

9.  Genetic variability in the regulation of gene expression in ten regions of the human brain.

Authors:  Adaikalavan Ramasamy; Daniah Trabzuni; Sebastian Guelfi; Vibin Varghese; Colin Smith; Robert Walker; Tisham De; Lachlan Coin; Rohan de Silva; Mark R Cookson; Andrew B Singleton; John Hardy; Mina Ryten; Michael E Weale
Journal:  Nat Neurosci       Date:  2014-08-31       Impact factor: 24.884

10.  TopHat: discovering splice junctions with RNA-Seq.

Authors:  Cole Trapnell; Lior Pachter; Steven L Salzberg
Journal:  Bioinformatics       Date:  2009-03-16       Impact factor: 6.937

View more
  11 in total

Review 1.  The MHC in the era of next-generation sequencing: Implications for bridging structure with function.

Authors:  Effie W Petersdorf; Colm O'hUigin
Journal:  Hum Immunol       Date:  2018-10-12       Impact factor: 2.850

2.  Expression estimation and eQTL mapping for HLA genes with a personalized pipeline.

Authors:  Vitor R C Aguiar; Jônatas César; Olivier Delaneau; Emmanouil T Dermitzakis; Diogo Meyer
Journal:  PLoS Genet       Date:  2019-04-22       Impact factor: 5.917

3.  Nongenic cancer-risk SNPs affect oncogenes, tumour-suppressor genes, and immune function.

Authors:  Maud Fagny; John Platig; Marieke Lydia Kuijjer; Xihong Lin; John Quackenbush
Journal:  Br J Cancer       Date:  2019-12-06       Impact factor: 7.640

4.  TNF-Block Genotypes Influence Susceptibility to HIV-Associated Sensory Neuropathy in Indonesians and South Africans.

Authors:  Jessica Gaff; Fitri Octaviana; Prinisha Pillay; Huguette Gaelle Ngassa Mbenda; Ibnu A Ariyanto; June Anne Gan; Catherine L Cherry; Peter Kamerman; Simon M Laws; Patricia Price
Journal:  Int J Mol Sci       Date:  2020-01-07       Impact factor: 5.923

5.  Global discovery of lupus genetic risk variant allelic enhancer activity.

Authors:  Xiaoming Lu; Xiaoting Chen; Carmy Forney; Omer Donmez; Daniel Miller; Sreeja Parameswaran; Ted Hong; Yongbo Huang; Mario Pujato; Tareian Cazares; Emily R Miraldi; John P Ray; Carl G de Boer; John B Harley; Matthew T Weirauch; Leah C Kottyan
Journal:  Nat Commun       Date:  2021-03-12       Impact factor: 17.694

6.  HLA RNA Sequencing With Unique Molecular Identifiers Reveals High Allele-Specific Variability in mRNA Expression.

Authors:  Tiira Johansson; Dawit A Yohannes; Satu Koskela; Jukka Partanen; Päivi Saavalainen
Journal:  Front Immunol       Date:  2021-02-25       Impact factor: 7.561

7.  Systematic genetic analysis of the MHC region reveals mechanistic underpinnings of HLA type associations with disease.

Authors:  Matteo D'Antonio; Joaquin Reyna; David Jakubosky; Margaret Kr Donovan; Marc-Jan Bonder; Hiroko Matsui; Oliver Stegle; Naoki Nariai; Agnieszka D'Antonio-Chronowska; Kelly A Frazer
Journal:  Elife       Date:  2019-11-20       Impact factor: 8.713

8.  Gene expression variability in human and chimpanzee populations share common determinants.

Authors:  Benjamin Jung Fair; Lauren E Blake; Abhishek Sarkar; Bryan J Pavlovic; Claudia Cuevas; Yoav Gilad
Journal:  Elife       Date:  2020-10-21       Impact factor: 8.140

9.  Differential expression of predisposing HLA-DQ2.5 alleles in DR5/DR7 celiac disease patients affects the pathological immune response to gluten.

Authors:  Laura Pisapia; Stefania Picascia; Federica Farina; Pasquale Barba; Carmen Gianfrani; Giovanna Del Pozzo
Journal:  Sci Rep       Date:  2020-10-14       Impact factor: 4.379

10.  Clinicogenomic factors of biotherapy immunogenicity in autoimmune disease: A prospective multicohort study of the ABIRISK consortium.

Authors:  Signe Hässler; Delphine Bachelet; Julianne Duhaze; Natacha Szely; Aude Gleizes; Salima Hacein-Bey Abina; Orhan Aktas; Michael Auer; Jerôme Avouac; Mary Birchler; Yoram Bouhnik; Olivier Brocq; Dorothea Buck-Martin; Guillaume Cadiot; Franck Carbonnel; Yehuda Chowers; Manuel Comabella; Tobias Derfuss; Niek De Vries; Naoimh Donnellan; Abiba Doukani; Michael Guger; Hans-Peter Hartung; Eva Kubala Havrdova; Bernhard Hemmer; Tom Huizinga; Kathleen Ingenhoven; Poul Erik Hyldgaard-Jensen; Elizabeth C Jury; Michael Khalil; Bernd Kieseier; Anna Laurén; Raija Lindberg; Amy Loercher; Enrico Maggi; Jessica Manson; Claudia Mauri; Badreddine Mohand Oumoussa; Xavier Montalban; Maria Nachury; Petra Nytrova; Christophe Richez; Malin Ryner; Finn Sellebjerg; Claudia Sievers; Dan Sikkema; Martin Soubrier; Sophie Tourdot; Caroline Trang; Alessandra Vultaggio; Clemens Warnke; Sebastian Spindeldreher; Pierre Dönnes; Timothy P Hickling; Agnès Hincelin Mery; Matthieu Allez; Florian Deisenhammer; Anna Fogdell-Hahn; Xavier Mariette; Marc Pallardy; Philippe Broët
Journal:  PLoS Med       Date:  2020-10-30       Impact factor: 11.069

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.