
A genome-wide association study of marginal zone lymphoma shows association to the HLA region.

Joseph Vijai1, Zhaoming Wang2, Sonja I Berndt3, Christine F Skibola4, Susan L Slager5, Silvia de Sanjose6, Mads Melbye7, Bengt Glimelius8, Paige M Bracci9, Lucia Conde4, Brenda M Birmann10, Sophia S Wang11, Angela R Brooks-Wilson12, Qing Lan3, Paul I W de Bakker13, Roel C H Vermeulen14, Carol Portlock1, Stephen M Ansell15, Brian K Link16, Jacques Riby4, Kari E North17, Jian Gu18, Henrik Hjalgrim19, Wendy Cozen20, Nikolaus Becker21, Lauren R Teras22, John J Spinelli23, Jenny Turner24, Yawei Zhang25, Mark P Purdue3, Graham G Giles26, Rachel S Kelly27, Anne Zeleniuch-Jacquotte28, Maria Grazia Ennas29, Alain Monnereau30, Kimberly A Bertrand31, Demetrius Albanes3, Tracy Lightfoot32, Meredith Yeager2, Charles C Chung3, Laurie Burdett2, Amy Hutchinson2, Charles Lawrence33, Rebecca Montalvan33, Liming Liang34, Jinyan Huang35, Baoshan Ma36, Danylo J Villano1, Ann Maria1, Marina Corines1, Tinu Thomas1, Anne J Novak15, Ahmet Dogan37, Mark Liebow15, Carrie A Thompson15, Thomas E Witzig15, Thomas M Habermann15, George J Weiner16, Martyn T Smith38, Elizabeth A Holly9, Rebecca D Jackson39, Lesley F Tinker40, Yuanqing Ye18, Hans-Olov Adami41, Karin E Smedby42, Anneclaire J De Roos43, Patricia Hartge3, Lindsay M Morton3, Richard K Severson44, Yolanda Benavente6, Paolo Boffetta45, Paul Brennan46, Lenka Foretova47, Marc Maynadie48, James McKay49, Anthony Staines50, W Ryan Diver22, Claire M Vajdic51, Bruce K Armstrong52, Anne Kricker52, Tongzhang Zheng25, Theodore R Holford53, Gianluca Severi54, Paolo Vineis55, Giovanni M Ferri56, Rosalia Ricco57, Lucia Miligi58, Jacqueline Clavel59, Edward Giovannucci60, Peter Kraft34, Jarmo Virtamo61, Alex Smith32, Eleanor Kane32, Eve Roman32, Brian C H Chiu62, Joseph F Fraumeni3, Xifeng Wu18, James R Cerhan5, Kenneth Offit1, Stephen J Chanock3, Nathaniel Rothman3, Alexandra Nieters63.   

Abstract

Marginal zone lymphoma (MZL) is the third most common subtype of B-cell non-Hodgkin lymphoma. Here we perform a two-stage GWAS of 1,281 MZL cases and 7,127 controls of European ancestry and identify two independent loci near BTNL2 (rs9461741, P=3.95 × 10^−15) and HLA-B (rs2922994, P=2.43 × 10^−9) in the HLA region significantly associated with MZL risk. This is the first evidence that genetic variation in the major histocompatibility complex influences MZL susceptibility.

Year:  2015        PMID: 25569183      PMCID: PMC4287989          DOI: 10.1038/ncomms6751

Source DB:  PubMed          Journal:  Nat Commun        ISSN: 2041-1723            Impact factor:   17.694


Marginal zone lymphoma (MZL) encompasses a group of lymphomas that originate from marginal zone B cells present in extranodal tissue and lymph nodes. Three subtypes of MZL have been defined, extranodal MZL of mucosa-associated lymphoid tissue (MALT), splenic MZL and nodal MZL, which together account for 7–12% of all non-Hodgkin lymphoma (NHL) cases. Geographic differences in incidence have been observed1, and inflammation, immune system dysregulation and infectious agents, such as Helicobacter pylori, have been implicated particularly for the gastric MALT subtype2, but little else is known of MZL aetiology. Here we perform the first two-stage, subtype-specific genome-wide association study (GWAS) of MZL and identify two independent single-nucleotide polymorphisms (SNPs) within the HLA region associated with MZL risk. Together with recent studies on other common subtypes of NHL, these results point to shared susceptibility loci for lymphoma in the HLA region.

Results

Stage 1 MZL GWAS

As part of a larger NHL GWAS, 890 MZL cases and 2,854 controls from 22 studies in the United States and Europe (Supplementary Table 1) were genotyped using the Illumina OmniExpress array. Genotype data from the Illumina Omni2.5 was also available for 3,536 controls from three of the 22 studies3. After applying rigorous quality control filters (Supplementary Table 2, Methods), data for 611,856 SNPs with minor allele frequency (MAF)>1% in 825 cases and 6,221 controls of European ancestry (Supplementary Fig. 1) remained for the stage 1 analysis (Supplementary Table 3). To discover variants associated with risk, logistic regression analysis was performed on these SNPs adjusting for age, gender and three significant eigenvectors computed using principal components analysis (Supplementary Fig. 2, Methods). Examination of the quantile–quantile (Q–Q) plot (Supplementary Fig. 3) showed minimal detectable evidence for population substructure (λ=1.01) with some excess of small P values. A Manhattan plot revealed association signals at the HLA region (Supplementary Fig. 4; 6p21.33:31,061,211–32,620,572) on chromosome 6 reaching genome-wide significance. Removal of all SNPs in the HLA region resulted in an attenuation of the excess of small P values observed in the Q–Q plot, although some excess still remained. To further explore associations within the HLA region and identify other regions potentially associated with risk, common SNPs available in the 1000 Genomes project data release 3 were imputed (Methods).

Stage 2 genotyping

Ten SNPs in promising loci with P≤7.5 × 10^−6 in the stage 1 discovery were selected for replication (stage 2) in an additional 456 cases and 906 controls of European ancestry (Supplementary Tables 1 and 3). Of the SNPs selected for replication, two SNPs were directly genotyped on the OmniExpress, while the remaining eight were imputed with high accuracy (median info score=0.99) in stage 1 (Supplementary Table 4). Replication was carried out using TaqMan genotyping. In the combined meta-analysis of 1,281 cases and 7,127 controls, we identified two distinct loci (Table 1, Fig. 1, Supplementary Table 4) at chromosomes 6p21.32 and 6p21.33 that reached the threshold of genome-wide statistical significance (P<5 × 10^−8). These are rs9461741 in the butyrophilin-like 2 (MHC class II associated) (BTNL2) gene at 6p21.32 in HLA class II (P=3.95 × 10^−15, odds ratio (OR)=2.66, 95% confidence interval (CI)=2.08–3.39) and rs2922994 at 6p21.33 in HLA class I (P=2.43 × 10^−9, OR=1.64, 95% CI=1.39–1.92). These two SNPs were weakly correlated (r2=0.008 in 1000 Genomes CEU population), and when both were included in the same statistical model, both SNPs remained strongly associated with MZL risk (rs9461741, P=2.09 × 10^−15; rs2922994, P=6.03 × 10^−10), suggesting that the two SNPs are independent. Both SNPs were weakly correlated with other SNPs in the HLA region previously reported to be associated with other NHL subtypes or Hodgkin lymphoma (r2<0.14 for all pairwise comparisons). None of the previously reported SNPs were significantly associated with MZL risk after adjustment for multiple testing (significance threshold P<0.0025) in our study, suggesting the associations are subtype-specific (Supplementary Table 5). Another SNP, rs7750641 (P=3.34 × 10^−8; Supplementary Table 4), in strong linkage disequilibrium (LD) with rs2922994 (r2=0.85), also showed promising association with MZL risk. rs7750641 is a missense variant in transcription factor 19 (TCF19), which encodes a DNA-binding protein implicated in the transcription of genes during the G1–S transition in the cell cycle4. The non-HLA SNPs genotyped in stage 2 were not associated with MZL risk (Supplementary Table 4).
Table 1

Association results for two new independent SNPs with MZL in a two-stage GWAS.

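| Chr | Nearest gene(s) | SNP | Position* | Risk allele† | Other allele | RAF‡ | Stage | No. of cases/no. of controls | OR | 95% CI | P value§ | Pheterogeneity | I2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6p21.32 | BTNL2 | rs9461741 | 32370587 | C | G | 0.018 | Stage 1 | 824/6,220 | 2.40 | (1.74–3.31) | 9.11E-08 | | |
| | | | | | | 0.030 | Stage 2 | 453/877 | 3.06 | (2.10–4.46) | 5.24E-09 | | |
| | | | | | | | Combined | 1,277/7,097 | 2.66 | (2.08–3.39) | 3.95E-15 | 0.216 | 34.69 |
| 6p21.33 | HLA-B | rs2922994 | 31335901 | G | A | 0.113 | Stage 1 | 825/6,221 | 1.74 | (1.43–2.12) | 2.89E-08 | | |
| | | | | | | 0.094 | Stage 2 | 405/832 | 1.43 | (1.08–1.90) | 0.01 | | |
| | | | | | | | Combined | 1,230/7,053 | 1.64 | (1.39–1.92) | 2.43E-09 | 0.507 | 0 |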

CI, confidence interval; GWAS, genome-wide association study; MZL, marginal zone lymphoma; OR, odds ratio; RAF, risk allele frequency; SNP, single-nucleotide polymorphism.

*Position according to human reference NCBI37/hg19.

†Allele associated with an increased risk of MZL.

‡Risk allele frequency in controls.

§For stages 1 and 2, P values were generated using logistic regression. For the combined stage, the odds ratio and P values were generated using a fixed effects model. Heterogeneity in the effect estimates was assessed using Cochran’s Q statistic and by estimating the I2 statistic.

Figure 1

Regional plot showing the HLA associations with MZL.

The figure shows the association log10 P values from the log-additive genetic model for all SNPs in the region from stage 1 (dots) (n=825 cases, n=6,221 controls) and the log10 P values from the log-additive genetic model for both stage 1 and 2 combined (purple diamonds) for rs2922994 (n=1,230 cases, n=7,053 controls) and rs9461741 (n=1,277 cases, n=7,097 controls). The purple dots show the log10 P values of these SNPs in stage 1. Top panel (a) shows the region encompassing both SNPs. Bottom panels show regional plots of (b) the most significant SNP at 6p21.33, rs2922994, and (c) rs9461741 at 6p21.32. The colours of the dots reflect the LD (as measured by r2) with the most significant SNP as shown in the legend box.

HLA alleles

To obtain additional insight into plausible functional variants, we imputed the classical HLA alleles and amino acid residues using SNP2HLA5 (Methods). No imputed HLA alleles or amino acid positions reached genome-wide significance (Supplementary Table 6). However, for HLA class I, the most promising associations were observed with HLA-B*08 (P=7.94 × 10^−8), HLA-B*08:01 (P=7.79 × 10^−8) and the HLA-B allele encoding an aspartic acid residue at position 9 (Asp9) (P=7.94 × 10^−8), located in the peptide binding groove of the protein. HLA-B*08:01 and Asp9 are highly correlated (r2≥0.99), and thus their effect sizes were identical (OR=1.67, 95% CI: 1.38–2.01). They are both also in strong LD with rs2922994 (r2=0.97). Because they are collinear, the effects of the SNPs and alleles were indistinguishable from one another in conditional modelling. For HLA class II, a suggestive association was observed with HLA-DRB1*01:02 (OR=2.24, 95% CI: 1.64–3.07, P=5.08 × 10^−7; Supplementary Table 6), which is moderately correlated with rs9461741 (r2=0.69). Conditional analysis revealed that the effects of rs9461741 (the intragenic SNP in BTNL2) and HLA-DRB1*01:02 were statistically indistinguishable (stage 1: rs9461741, Padjusted=0.064 and HLA-DRB1*01:02, Padjusted=0.29). A model containing both HLA-B*08:01 and HLA-DRB1*01:02 showed that the two alleles were independent (HLA-B*08:01: Padjusted=4.65 × 10^−8 and HLA-DRB1*01:02: Padjusted=2.97 × 10^−7), further supporting independent associations in HLA class I and II loci.

MALT versus non-MALT

Heterogeneity between the largest subtype of MZL, namely MALT, and the other subtypes grouped as non-MALT was evaluated for the MZL-associated SNPs (Supplementary Table 7). The effects were slightly stronger for MALT, but no evidence for substantial heterogeneity was observed (Pheterogeneity≥0.05). Studies have suggested that H. pylori infection is a risk factor for gastric MZL2. An examination of SNPs previously suggested to be associated with H. pylori infection in independent studies6 did not reveal any significant association with the combined MZL or the MALT subtype in this study (Supplementary Table 8). Toll-like receptors (TLR) are considered strong candidates in mediating inflammatory immune response to pathogenic insults. A previous study7 reported a nominally significant association of rs4833103 in the TLR10–TLR1–TLR6 region with MZL risk. After excluding the cases and controls in the previous report7, we found little additional support for this locus (MZL: P=0.006, OR=1.18 and MALT: P=0.38, OR=1.08).

Secondary functional analyses

To gain additional insight into potential biological mechanisms, expression quantitative trait loci (eQTL) analyses were performed in two datasets consisting of lymphoblastoid cell lines (Methods). Significant associations were seen for rs2922994 and rs7750641 with HLA-B and HLA-C (Supplementary Table 9), while suggestive associations (false discovery rate, FDR≤0.05) for correlated SNPs of rs2922994 (r2>0.8) in HLA class I and RNF5 (Supplementary Table 10) were observed. No significant eQTL association was observed for rs9461741 or other correlated HLA class II SNPs. Chromatin state analysis (Methods) using ENCODE data revealed that SNPs correlated with rs2922994 showed a chromatin state consistent with the prediction for an active promoter (rs3094005) or satisfied the state of a weak promoter (rs2844577) in the lymphoblastoid cell line GM12878 (Supplementary Fig. 5). GM12878 is the only lymphoblastoid cell line from which high-quality whole-genome annotation data for chromatin state is readily available. Analyses were also performed with HaploReg (Supplementary Table 11) and RegulomeDB (Supplementary Table 12) that showed overlap of the SNPs with functional motifs, suggesting plausible roles in gene regulatory processes.

Discussion

The most statistically significant SNP associated with MZL, rs9461741, is located in HLA class II in the intron between exons 3 and 4 of the BTNL2 gene. BTNL2 is highly expressed in lymphoid tissues8 and has close homology to the B7 co-stimulatory molecules, which initiate lymphocyte activation as part of antigen presentation. Evidence is consistent with BTNL2 acting as a negative regulator of T-cell proliferation and cytokine production8,9 and attenuating T-cell-mediated responses in the gut10. We were unable to statistically differentiate the effects of rs9461741 from HLA-DRB1*01:02 and, thus, our observed association could be due to HLA-DRB1. HLA-DRB1 has been shown to be associated with other autoimmune diseases, including rheumatoid arthritis11 and selective IgA deficiency12. Similarly, rs2922994 is located 11 kb upstream of HLA-B, which is known to play a critical role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. rs7750641, a missense variant in TCF19, was previously associated with pleiotropic effects on blood-based phenotypes13, and TCF19 is highly expressed in germinal center cells and up-regulated in human pro-B and pre-B cells14. Autoimmune diseases, such as Sjögren’s syndrome and systemic lupus erythematosus, are established risk factors for developing MZL, with the strongest associations seen between Sjögren's syndrome and the MALT subtype15. Of note, SNPs in HLA-B and the classical allele HLA-DRB1*01:02 are strongly associated with Sjögren’s syndrome16, while HLA-DRB1*03 has been associated with rheumatoid arthritis17. The multiple independent associations in the HLA region and their localization to known functional autoimmune and B-cell genes suggest possible shared genetic effects that span both lymphoid cancers and autoimmune diseases. 
Chronic autoimmune stimulation leading to over-activity and defective apoptosis of B cells, and secondary inflammation events triggered by genetic and environmental factors are biological mechanisms that may contribute to the pathogenesis of MZL. We have performed the largest GWAS of MZL to date and identified two independent SNPs within the HLA region that are robustly associated with the risk of MZL. In addition to the known diversity in etiology and pathology, there is mounting evidence of genetic heterogeneity across the NHL subtypes of lymphoma. However, the HLA region appears to be commonly associated with multiple major subtypes, such as MZL, CLL18, DLBCL19 and FL20,21,22,23. Further studies are needed to identify biological mechanisms underlying these relationships and advance our knowledge regarding their interactions with associated environmental factors that may modulate disease risks.

Methods

Stage 1 MZL GWAS study subjects and ethics

As part of a larger NHL GWAS initiative, we conducted a GWAS of MZL using 890 cases and 2,854 controls of European descent from 22 studies of NHL (Supplementary Tables 1 and 2), including nine prospective cohort studies, eight population-based case–control studies, and five clinic or hospital-based case–control studies. All studies were approved by their respective Institutional Review Boards, as follows: ATBC: NCI Special Studies Institutional Review Board; BCCA: UBC BC Cancer Agency Research Ethics Board; CPS-II: American Cancer Society; ELCCS: Northern and Yorkshire Research Ethics Committee; ENGELA: IRB00003888, Comite d'Evaluation Ethique de l'Inserm IRB #1; EPIC: Imperial College London; EpiLymph: International Agency for Research on Cancer; HPFS: Harvard School of Public Health (HSPH) Institutional Review Board; Iowa-Mayo SPORE: University of Iowa Institutional Review Board; Italian GxE: Comitato Etico Azienda Ospedaliero Universitaria di Cagliari; Mayo Clinic Case–Control: Mayo Clinic Institutional Review Board; MCCS: Cancer Council Victoria's Human Research Ethics Committee; MD Anderson: University of Texas MD Anderson Cancer Center Institutional Review Board; MSKCC: Memorial Sloan-Kettering Cancer Center Institutional Review Board; NCI-SEER: NCI Special Studies Institutional Review Board; NHS: Partners Human Research Committee, Brigham and Women's Hospital; NSW: NSW Cancer Council Ethics Committee; NYU-WHS: New York University School of Medicine Institutional Review Board; PLCO: NCI Special Studies Institutional Review Board; SCALE: Scientific Ethics Committee for the Capital Region of Denmark; SCALE: Regional Ethical Review Board in Stockholm (Section 4) IRB#5; UCSF2: University of California San Francisco Committee on Human Research; WHI: Fred Hutchinson Cancer Research Center; Yale: Human Investigation Committee, Yale University School of Medicine. Informed consent was obtained from all participants. 
Cases were ascertained from cancer registries, clinics or hospitals or through self-report verified by medical and pathology reports. To determine the NHL subtype, phenotype data for all NHL cases were reviewed centrally at the International Lymphoma Epidemiology Consortium (InterLymph) Data Coordinating Center and harmonized using the hierarchical classification proposed by the InterLymph Pathology Working Group24,25 based on the World Health Organization (WHO) classification26.

Genotyping and quality control

All MZL cases with sufficient DNA (n=890) and a subset of controls (n=2,854) frequency matched by age, sex and study to the entire group of NHL cases, along with 4% quality control duplicates, were genotyped on the Illumina OmniExpress at the NCI Core Genotyping Resource (CGR). Genotypes were called using Illumina GenomeStudio software, and quality control duplicates showed >99% concordance. Monomorphic SNPs and SNPs with a call rate of <95% were excluded. Samples with a call rate of ≤93%, mean heterozygosity <0.25 or >0.33 based on the autosomal SNPs or gender discordance (>5% heterozygosity on the X chromosome for males and <20% heterozygosity on the X chromosome for females) were excluded. Furthermore, unexpected duplicates (>99.9% concordance) and first-degree relatives based on identity by descent sharing with Pi-hat >0.40 were excluded. Ancestry was assessed using the Genotyping Library and Utilities (GLU; http://code.google.com/p/glu-genetics/) struct.admix module based on the method by Pritchard et al.27 and participants with <80% European ancestry were excluded (Supplementary Fig. 1). After exclusions, 825 cases and 2,685 controls remained (Supplementary Table 2). Genotype data previously generated on the Illumina Omni2.5 from an additional 3,536 controls from three of the 22 studies (ATBC, CPS-II and PLCO) were also included3, resulting in a total of 825 cases and 6,221 controls for the stage 1 analysis (Supplementary Table 3). Of these additional 3,536 controls, 703 (~235 from each study) were selected to be representative of their cohort and cancer free3, while the remainder were cancer-free controls from an unpublished study of prostate cancer in the PLCO. SNPs with call rate <95%, with Hardy–Weinberg equilibrium P<1 × 10^−6, or with a MAF <1% were excluded from analysis, leaving 611,856 SNPs for analysis. 
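The SNP-level filters described above (call rate, MAF and Hardy–Weinberg equilibrium) can be illustrated with a minimal sketch. The thresholds are taken from the text; the function names are hypothetical, and the one-degree-of-freedom chi-square HWE test is an assumption, since the text does not say which HWE test was used.

```python
import math

def maf(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)

def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square test is used purely for illustration.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

def passes_snp_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int) -> bool:
    """Apply the stated SNP filters: call rate >=95%, MAF >=1%, HWE P >=1e-6."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    if called / (called + n_missing) < 0.95:
        return False
    if maf(n_aa, n_ab, n_bb) < 0.01:
        return False
    return hwe_chi2_p(n_aa, n_ab, n_bb) >= 1e-6
```

A SNP with genotype counts in perfect Hardy–Weinberg proportions and no missing calls passes; one with no heterozygotes at all fails the HWE filter.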
To evaluate population substructure, a principal components analysis was performed using the Genotyping Library and Utilities (GLU), version 1.0, struct.pca module, which is similar to EIGENSTRAT28 (http://genepath.med.harvard.edu/~reich/Software.htm). Plots of the first five principal components are shown in Supplementary Fig. 2. The genomic inflation factor was computed before (λ=1.014) and after (λ=1.010) removal of SNPs in the HLA loci. Association testing was conducted assuming a log-additive genetic model, adjusting for age, sex and three significant principal components. All data analyses and management were conducted using GLU.
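For readers unfamiliar with the genomic inflation factor, the following sketch shows one common way to compute λ from a set of association P values: convert each P value to a 1-df chi-square statistic and divide the median by the median of the null chi-square distribution. This is an illustrative assumption about the calculation, not a description of the GLU implementation; the function name is hypothetical.

```python
from statistics import NormalDist, median

_ND = NormalDist()
# Median of the chi-square distribution with 1 df (about 0.455)
CHI2_1DF_MEDIAN = _ND.inv_cdf(0.75) ** 2

def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided P values in (0, 1]."""
    # A two-sided P value p corresponds to chi-square = z^2 with z = Phi^-1(p / 2)
    chi2 = [_ND.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

With uniformly distributed P values, λ is close to 1; values slightly above 1, such as those reported here, indicate minimal inflation.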

Imputation of variants

To more comprehensively evaluate the genome for SNPs associated with MZL, SNPs in the stage 1 discovery GWAS were imputed using IMPUTE2 (ref. 29; http://mathgen.stats.ox.ac.uk/impute/impute_v2.html) and the 1000 Genomes Project (1kGP; http://www.1000genomes.org/) version 3 data29,30. SNPs with a MAF <1% or information quality score (info) <0.3 were excluded from analysis, leaving 8,478,065 SNPs for association testing. Association testing on the imputed data was conducted using SNPTEST31 (https://mathgen.stats.ox.ac.uk/genetics_software/snptest/snptest.html) version 2, assuming dosages for the genotypes and adjusting for age, sex and three significant principal components. In a null model for MZL risk, the three eigenvectors EV1, EV3 and EV8 were nominally associated with MZL risk and hence were included to account for potential population stratification. Heterogeneity between MZL subtypes was assessed using a case–case comparison, adjusting for age, sex and significant principal components.

Stage 2 replication of SNPs from the GWAS

After ranking the SNPs by P value and LD filtering (r2<0.05), 10 SNPs from the most promising loci identified from stage 1 after imputation with P<7.5 × 10^−6 were taken forward for de novo replication in an additional 456 cases and 906 controls (Supplementary Tables 1 and 4). Wherever possible, we selected either the best directly genotyped SNP or the most significant imputed SNP for the locus. Only imputed SNPs with an information score >0.8 were considered for replication. Only SNPs with MAF >1% were selected for replication, and no SNPs were taken forward for replication in regions where they appeared as singletons or obvious artifacts. For the HLA region, we selected one additional SNP (rs7750641) that was highly correlated with rs2922994 for additional confirmation. Genotyping was conducted using custom TaqMan genotyping assays (Applied Biosystems) validated at the NCI Core Genotyping Resource. Genotyping was done at four centres. HapMap control samples genotyped across two centres yielded 100% concordance, as did blind duplicates (~5% of total samples). Due to the small number of samples, the MD Anderson, Mayo and NCI replication studies were pooled together for association testing; however, MSKCC samples were analysed separately to account for the available information on Ashkenazi ancestry. Association results were adjusted for age, gender and study site in the pooled analysis. The results from the stage 1 and stage 2 studies were then combined using a fixed effect meta-analysis method with inverse variance weighting based on the estimates and standard errors from each study. Heterogeneity in the effect estimates across studies was assessed using Cochran’s Q statistic and by estimating the I2 statistic. For all SNPs that reached genome-wide significance in Table 1, no substantial heterogeneity was observed among the studies (Pheterogeneity≥0.1 for all SNPs, Supplementary Table 4).
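The inverse-variance fixed effect combination described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, standard errors are back-calculated from the rounded ORs and 95% CIs in Table 1 (an assumption), and the heterogeneity statistics from such rounded, two-estimate input will not necessarily reproduce the published Pheterogeneity and I2 values.

```python
import math

Z95 = 1.959964  # 97.5th percentile of the standard normal distribution

def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover log(OR) and its standard error from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return beta, se

def fixed_effect_meta(estimates):
    """Inverse-variance weighted fixed effect meta-analysis.

    estimates: list of (beta, se) pairs, one per study.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1 / total)
    # Cochran's Q and I2 (as a percentage, truncated at zero)
    q = sum(w * (b - beta) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Two-sided P value from the normal approximation
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    return {"beta": beta, "se": se, "p": p, "q": q, "i2": i2}

# rs9461741, using the rounded stage 1 and stage 2 values from Table 1
stage1 = log_or_and_se(2.40, 1.74, 3.31)
stage2 = log_or_and_se(3.06, 2.10, 4.46)
result = fixed_effect_meta([stage1, stage2])
combined_or = math.exp(result["beta"])
ci = (math.exp(result["beta"] - Z95 * result["se"]),
      math.exp(result["beta"] + Z95 * result["se"]))
print(f"OR={combined_or:.2f}, 95% CI={ci[0]:.2f}-{ci[1]:.2f}, P={result['p']:.2e}")
```

Under these assumptions the combined OR comes out close to the 2.66 reported in Table 1, with small differences attributable to rounding of the inputs.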

Technical validation of imputed SNPs

Genotyping was conducted using custom TaqMan genotyping assays (Applied Biosystems) at the NCI Cancer Genomics Research Laboratory on a set of 470 individuals included in the stage 1 MZL GWAS. The allelic dosage r2 was calculated between the imputed dosages and the assayed genotypes. Both SNPs were imputed with high accuracy (INFO ≥0.99), and the imputed dosages correlated highly (r2≥0.99) with the genotypes obtained by TaqMan assays.
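The allelic dosage r2 used for this validation can be read as the squared Pearson correlation between imputed dosages (values between 0 and 2) and assayed genotypes (0, 1 or 2 copies of the allele). The sketch below illustrates that reading; the function name and the example values are hypothetical and are not study data.

```python
import math

def dosage_r2(imputed_dosages, assayed_genotypes):
    """Squared Pearson correlation between imputed dosages and assayed genotypes."""
    n = len(imputed_dosages)
    if n != len(assayed_genotypes) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(imputed_dosages) / n
    mean_y = sum(assayed_genotypes) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(imputed_dosages, assayed_genotypes))
    sxx = sum((x - mean_x) ** 2 for x in imputed_dosages)
    syy = sum((y - mean_y) ** 2 for y in assayed_genotypes)
    if sxx == 0 or syy == 0:
        raise ValueError("r2 is undefined when either input is constant")
    return sxy ** 2 / (sxx * syy)

# Hypothetical example: imputed dosages that track the assayed genotypes closely
example_r2 = dosage_r2([0.02, 0.97, 1.99, 1.03, 0.0], [0, 1, 2, 1, 0])
```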

HLA imputation and analysis

To determine if specific coding variants within HLA genes contributed to the diverse association signals, we imputed the classical HLA alleles (A, B, C, DQA1, DQB1, DRB1) and coding variants across the HLA region (chr6:20–40 Mb) using SNP2HLA5 (http://www.broadinstitute.org/mpg/snp2hla/). The imputation was based on a reference panel from the Type 1 Diabetes Genetics Consortium (T1DGC) consisting of genotype data from 5,225 individuals of European descent who were typed for HLA-A, B, C, DRB1, DQA1, DQB1, DPB1, DPA1 4-digit alleles. Imputation accuracy of HLA alleles was assessed by comparing the imputed HLA alleles to HLA sequencing data on a subset of samples from the NCI32. The concordance rates obtained were 97.32, 98.5, 98.14 and 97.49% for HLA-A, B, C and DRB1, respectively, in the NCI GWAS, suggesting robust performance of the imputation method. Due to the limited number of SNPs (7,253) in the T1DGC reference set, imputation of HLA SNPs was conducted with IMPUTE2 and the 1kGP reference set as described above. A total of 68,488 SNPs, 201 classical HLA alleles (two- and four-digit resolution) and 1,038 amino acid (AA) markers, including 103 AA positions that were ‘multi-allelic’ with three to six different residues present at each position, were successfully imputed (info score >0.3 for SNPs or r2>0.3 for alleles and AAs) and available for downstream analysis. Multi-allelic markers were analysed as binary markers (for example, allele present or absent) and a meta-analysis was conducted where we tested SNPs, HLA alleles and AAs across the HLA region for association with MZL using PLINK33 or SNPTEST31 as described above.
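The binary coding of multi-allelic markers mentioned above can be illustrated with a short sketch: for one amino acid position, each residue observed in the sample becomes its own present/absent indicator per individual. The function name and the example residues are hypothetical, and the real analysis worked on imputed (probabilistic) calls rather than the hard calls assumed here.

```python
def binary_markers(residues_by_sample):
    """Split one multi-allelic position into per-residue presence indicators.

    residues_by_sample: list of (residue_1, residue_2) tuples, one per individual,
    giving the residue carried on each of the two chromosomes.
    Returns a dict mapping each residue to a list of 0/1 flags (1 = carried).
    """
    residues = sorted({r for pair in residues_by_sample for r in pair})
    return {
        res: [1 if res in pair else 0 for pair in residues_by_sample]
        for res in residues
    }

# Hypothetical example: three individuals at a position with residues D, H and Y
markers = binary_markers([("D", "H"), ("H", "H"), ("Y", "D")])
```

Each resulting indicator can then be tested for association in the same way as a biallelic marker.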

eQTL analysis

We conducted an eQTL analysis using two independent datasets: childhood asthma34 and HapMap35. As described previously34 for the childhood asthma data set35, peripheral blood lymphocytes were transformed into lymphoblastoid cell lines for 830 parents and offspring from 206 families of European ancestry. Data from 405 children were used for the analysis as follows: using extracted RNA, gene expression was assessed with the Affymetrix HG-U133 Plus 2.0 chip. Genotyping was conducted using the Illumina Human-1 Beadchip and Illumina HumanHap300K Beadchip, and imputation performed using data from 1kGP. All SNPs selected for replication were tested for cis associations (defined as gene transcripts within 1 Mb), assuming an additive genetic model, adjusting for non-genetic effects in the gene expression value. Association testing was conducted using a variance component-based score test36 in MERLIN37, which accounts for the correlation between siblings. To gain insight into the relative importance of associations with our SNPs compared with other SNPs in the region, we also conducted conditional analyses, in which both the MZL SNP and the most significant SNP for the particular gene transcript (that is, peak SNP) were included in the same model. Only cis associations that reached P<6.8 × 10^−5, which corresponds to an FDR of 1%, are reported (Supplementary Table 9). The HapMap data set consisted of a publicly available RNAseq data set35 from transformed lymphoblastoid cell lines from 41 CEPH Utah residents with ancestry from northern and western Europe (HapMap-CEU) samples available from the Gene Expression Omnibus repository (http://www.ncbi.nlm.nih.gov/geo) under accession number GSE16921. In this data set, we examined the association between the two reported SNPs in the HLA region, rs2922994 and rs9461741, as well as all SNPs in LD (r2>0.8 in HapMap-CEU release 28) and expression levels of probes within 1 Mb of the SNPs. 
As rs9461741 was not genotyped in HapMap, we selected rs7742033 as a proxy as it was the strongest linked SNP available in HapMap (r2=0.49 in 1kGP-CEU). Genotyping data for these HapMap-CEU individuals were directly downloaded from HapMap ( www.hapmap.org). Correlation between expression and genotype for each SNP-probe pair was tested using the Spearman’s rank correlation test with t-distribution approximation and estimated with respect to the minor allele in HapMap-CEU. P values were adjusted using the Benjamini–Hochberg FDR correction and eQTLs were considered significant at an FDR<0.05 (Supplementary Table 10).
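The Benjamini–Hochberg correction applied to the HapMap eQTL P values can be sketched as below. This is a generic implementation of the standard step-up procedure, shown for illustration only; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical example: SNP-probe pairs are called significant at FDR < 0.05
example_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in example_fdr]
```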

Bioinformatics ENCODE and chromatin state dynamics

To assess chromatin state dynamics, we used Chromos38, which has precomputed data from ENCODE on nine cell types using ChIP-seq experiments39. These consist of B-lymphoblastoid cells (GM12878), hepatocellular carcinoma cells (HepG2), embryonic stem cells, erythrocytic leukaemia cells (hK562), umbilical vein endothelial cells, skeletal muscle myoblasts, normal lung fibroblasts, normal epidermal keratinocytes and mammary epithelial cells. In these precomputed data, the genome has been segmented using a multivariate hidden Markov model to reduce the combinatorial space to a set of interpretable chromatin states. The output from Chromos classifies the data into 15 chromatin states corresponding to repressed, poised and active promoters, strong and weak enhancers, putative insulators, transcribed regions and large-scale repressed and inactive domains (Supplementary Fig. 5).

Author contributions

J.Vijai, S.I.B., C.F.S., S.L.S., B.M.B., S.S.W., A.R.B.-W., Q.L., H.H., W.C., L.R.T., J.J.S., Y.Z., M.P.P., A.Z.-J., C.L., R.M., K.E.S., P.H., J.M., B.K.A., A.K., G.S., P.V., J.F.F., J.R.C., K.O., S.J.C., N.R. and A.N. organized and designed the study. J.Vijai, S.I.B., L.B., A.H., X.W., J.R.C., K.O., S.J.C. and N.R. conducted and supervised the genotyping of samples. J.Vijai, Z.W., S.I.B., C.F.S., S.d.S., L.C., P.I.W.d.B., J.G., M.Y., C.C.C., L.L., J.H., B.M., S.J.C. and N.R. contributed to the design and execution of statistical analysis. J.Vijai, Z.W., S.I.B., C.F.S., J.R.C., K.O., S.J.C., N.R. and A.N. wrote the first draft of the manuscript. J.Vijai, C.F.S., S.L.S., S.d.S., M.Melbye, B.G., P.M.B., L.C., B.M.B., S.S.W., A.R.B.-W., Q.L., R.C.H.V., C.P., S.M.A., B.K.L., J.R., K.E.N., J.G., H.H., W.C., N.B., L.R.T., J.J.S., J.T., Y.Z., M.P.P., G.G.G., R.S.K., A.Z.-J., M.G.E., A.Monnereau, K.A.B., D.A., T.L., D.J.V., A.Maria, M.C., T.T., A.J.N., A.D., M.L., C.A.T., T.E.W., T.M.H., G.J.W., M.T.S., E.A.H., R.D.J., L.F.T., Y.Y., H.-O.A., K.E.S., A.J.D.R., P.H., L.M.M., R.K.S., Y.B., P.Boffetta, P.Brennan, L.F., M.Maynadie, J.M., A.Staines, W.R.D., C.M.V., B.K.A., A.K., T.Z., T.R.H., G.S., P.V., G.M.F., R.R., L.M., J.C., E.G., P.K., J.Virtamo, A.Smith, E.K., E.R., B.C.H.C., J.F.F., X.W., J.R.C., K.O., N.R. and A.N. conducted the epidemiological studies and contributed samples to the GWAS and/or follow-up genotyping. All authors contributed to the writing of the manuscript.

Additional information

How to cite this article: Vijai, J. et al. A genome-wide association study of marginal zone lymphoma shows association to the HLA region. Nat. Commun. 6:5751 doi: 10.1038/ncomms6751 (2015).
