A genome-wide association study identifies new susceptibility loci for esophageal adenocarcinoma and Barrett's esophagus.

David M Levine, Weronica E Ek, Rui Zhang, Xinxue Liu, Lynn Onstad, Cassandra Sather, Pierre Lao-Sirieix, Marilie D Gammon, Douglas A Corley, Nicholas J Shaheen, Nigel C Bird, Laura J Hardie, Liam J Murray, Brian J Reid, Wong-Ho Chow, Harvey A Risch, Olof Nyrén, Weimin Ye, Geoffrey Liu, Yvonne Romero, Leslie Bernstein, Anna H Wu, Alan G Casson, Stephen J Chanock, Patricia Harrington, Isabel Caldas, Irene Debiram-Beecham, Carlos Caldas, Nicholas K Hayward, Paul D Pharoah, Rebecca C Fitzgerald, Stuart Macgregor, David C Whiteman, Thomas L Vaughan.

Abstract

Esophageal adenocarcinoma is a cancer with rising incidence and poor survival. Most such cancers arise in a specialized intestinal metaplastic epithelium, which is diagnostic of Barrett's esophagus. In a genome-wide association study, we compared esophageal adenocarcinoma cases (n = 2,390) and individuals with precancerous Barrett's esophagus (n = 3,175) with 10,120 controls in two phases. For the combined case group, we identified three new associations. The first is at 19p13 (rs10419226: P = 3.6 × 10^-10) in CRTC1 (encoding CREB-regulated transcription coactivator), whose aberrant activation has been associated with oncogenic activity. A second is at 9q22 (rs11789015: P = 1.0 × 10^-9) in BARX1, which encodes a transcription factor important in esophageal specification. A third is at 3p14 (rs2687201: P = 5.5 × 10^-9) near the transcription factor FOXP1, which regulates esophageal development. We also refine a previously reported association with Barrett's esophagus near the putative tumor suppressor gene FOXF1 at 16q24 and extend our findings to now include esophageal adenocarcinoma.

Year:  2013        PMID: 24121790      PMCID: PMC3840115          DOI: 10.1038/ng.2796

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


A genetic component to the development of Barrett’s esophagus and esophageal adenocarcinoma has long been suspected based on prior studies in unrelated individuals and familial disease clusters.[1-9] This study leverages the resources of the Barrett’s and Esophageal Adenocarcinoma Consortium (BEACON) and combines high-quality, largely population-based epidemiologic studies of esophageal adenocarcinoma and Barrett’s esophagus conducted over two decades.

For the discovery phase, we used 1,516 esophageal adenocarcinoma (EA) cases, 2,416 Barrett’s esophagus (BE) cases and 3,209 controls, all of European ancestry, after rigorous quality control (QC) procedures (Online Methods) were applied to the genotyping data. The cases and 2,187 of the controls were collected by investigators in BEACON from cohort and case-control studies conducted in Western Europe, Australia and North America. An additional 1,022 cancer-free controls were obtained from a study of melanoma[10] and included to increase statistical power. All cases were histologically confirmed. The distribution of samples by study is given in Supplementary Table 1 and their demographic characteristics in Supplementary Table 2.

All samples were genotyped on the Illumina HumanOmni1-Quad platform. We performed association analyses on the 922,031 autosomal and X chromosome SNPs that passed QC using an additive logistic regression model implemented in GWASTools,[11] including age, sex and the first four eigenvectors from principal component analysis (PCA) as covariates. To assess variants not present on the Illumina HumanOmni1-Quad, we performed imputation for each region of interest (Tables 1 and 2). To identify shared genetic susceptibility loci for the two conditions, we treated the esophageal adenocarcinoma and Barrett’s esophagus cases as a single phenotype. The results are shown in a Manhattan plot in Figure 1. We also compared Barrett’s esophagus and esophageal adenocarcinoma separately against the controls (Supplementary Figure 1). QQ plots are shown in Supplementary Figure 2. The genomic inflation factor λ for analysis of the combined case group was 1.039 (1.084 without the first four principal components), indicating no evidence of population stratification in our data set.
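The text reports the genomic inflation factor λ but not how it was computed. A minimal sketch, assuming the common definition (median observed 1-df chi-square statistic divided by the median of the null chi-square distribution, about 0.455), is shown below. The function name and the evenly spaced P-values are hypothetical and serve only as illustration; they are not study data.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(p_values):
    """Return lambda: median observed 1-df chi-square / expected null median."""
    nd = NormalDist()
    # A two-sided P-value p corresponds to z = Phi^-1(p / 2); chi-square = z^2.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN

# Hypothetical illustration: evenly spaced P-values mimic a null distribution,
# so lambda should sit very close to 1.
null_p = [(i + 0.5) / 10000 for i in range(10000)]
lam = genomic_inflation(null_p)
print(round(lam, 3))
```

Under this definition, values only slightly above 1, such as those quoted above, are usually read as little residual inflation of the test statistics.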
Table 1

Top five newly identified SNPs associated with Barrett’s esophagus (BE) and esophageal adenocarcinoma (EA). Shown are the discovery, replication and meta-analysis results for Barrett’s esophagus v. controls, esophageal adenocarcinoma v. controls and the combined Barrett’s esophagus and esophageal adenocarcinoma (BE+EA) v. controls. Each cell contains the association p-value, the odds ratio (OR) and 95% confidence interval (CI) for the minor allele, and the frequency of the minor allele in cases and controls. The slight variation in the number of discovery controls reflects the use of only unrelated samples for analysis, although six two-person families are present in the data set.

| SNP | Chr. | Position | Cytoband | Nearest gene | Minor*/major allele (plus strand) |
|---|---|---|---|---|---|
| rs2687201 | 3 | 70928930 | 3p13 | FOXP1 | T/G |
| rs11789015 | 9 | 96716028 | 9q22 | BARX1 | G/A |
| rs6479527 | 9 | 96858411 | 9q22 | PTPDC1 | T/C |
| rs10419226 | 19 | 18803172 | 19p13 | CRTC1 | A/C |
| rs10423674 | 19 | 18817903 | 19p13 | CRTC1 | T/G |

Each result cell gives P; OR (95% CI); MAF case/control. P-values are PDISCOVERY, PREPLICATION or PMETA for the phenotype in that row.

| Phenotype | Phase | Cases/controls | rs2687201 | rs11789015 | rs6479527 | rs10419226 | rs10423674 |
|---|---|---|---|---|---|---|---|
| Barrett’s esophagus | Discovery | 2,416/3,206 | 2.27 × 10^-5; 1.19 (1.10 – 1.29); 0.335/0.300 | 2.34 × 10^-4; 0.85 (0.78 – 0.93); 0.264/0.292 | 9.16 × 10^-5; 0.86 (0.79 – 0.93); 0.468/0.505 | 2.32 × 10^-8; 1.24 (1.15 – 1.34); 0.492/0.442 | 6.84 × 10^-6; 0.83 (0.77 – 0.90); 0.303/0.338 |
| Barrett’s esophagus | Replication | 759/6,911 | 2.48 × 10^-2; 1.14 (1.02 – 1.28); 0.336/0.305 | 6.60 × 10^-3; 0.84 (0.74 – 0.95); 0.247/0.279 | 1.29 × 10^-1; 0.92 (0.82 – 1.02); 0.461/0.481 | 1.22 × 10^-1; 1.09 (0.98 – 1.21); 0.486/0.462 | 5.67 × 10^-2; 0.89 (0.80 – 1.00); 0.312/0.338 |
| Barrett’s esophagus | Meta-analysis | 3,175/10,117 | 2.00 × 10^-6; 1.18 (1.10 – 1.26); 0.335/0.303 | 5.08 × 10^-6; 0.85 (0.79 – 0.91); 0.260/0.283 | 4.74 × 10^-5; 0.88 (0.82 – 0.93); 0.467/0.489 | 5.54 × 10^-8; 1.19 (1.12 – 1.26); 0.491/0.456 | 1.92 × 10^-6; 0.85 (0.80 – 0.91); 0.306/0.338 |
| Esophageal adenocarcinoma | Discovery | 1,516/3,209 | 3.27 × 10^-6; 1.26 (1.14 – 1.39); 0.343/0.300 | 1.96 × 10^-4; 0.83 (0.75 – 0.91); 0.257/0.292 | 1.30 × 10^-4; 0.84 (0.77 – 0.92); 0.463/0.504 | 1.20 × 10^-5; 1.23 (1.12 – 1.34); 0.486/0.442 | 3.15 × 10^-3; 0.86 (0.78 – 0.95); 0.316/0.338 |
| Esophageal adenocarcinoma | Replication | 874/6,911 | 2.31 × 10^-2; 1.14 (1.02 – 1.27); 0.335/0.305 | 2.11 × 10^-4; 0.80 (0.71 – 0.90); 0.238/0.279 | 4.43 × 10^-3; 0.86 (0.77 – 0.95); 0.445/0.481 | 1.20 × 10^-2; 1.14 (1.03 – 1.26); 0.497/0.462 | 7.59 × 10^-5; 0.80 (0.71 – 0.89); 0.289/0.338 |
| Esophageal adenocarcinoma | Meta-analysis | 2,390/10,120 | 5.76 × 10^-7; 1.20 (1.12 – 1.29); 0.335/0.303 | 1.80 × 10^-7; 0.81 (0.75 – 0.88); 0.254/0.283 | 2.03 × 10^-6; 0.85 (0.79 – 0.91); 0.460/0.489 | 8.35 × 10^-7; 1.19 (1.11 – 1.27); 0.494/0.456 | 1.46 × 10^-6; 0.83 (0.77 – 0.90); 0.298/0.338 |
| Combined (BE+EA) | Discovery | 3,928/3,207 | 3.01 × 10^-7; 1.21 (1.12 – 1.30); 0.338/0.300 | 9.88 × 10^-6; 0.84 (0.78 – 0.91); 0.261/0.292 | 3.47 × 10^-6; 0.85 (0.80 – 0.91); 0.466/0.504 | 2.00 × 10^-9; 1.23 (1.15 – 1.32); 0.490/0.442 | 5.52 × 10^-6; 0.84 (0.79 – 0.91); 0.308/0.338 |
| Combined (BE+EA) | Replication | 1,633/6,911 | 2.64 × 10^-3; 1.14 (1.05 – 1.24); 0.336/0.305 | 2.25 × 10^-5; 0.82 (0.75 – 0.90); 0.242/0.279 | 3.51 × 10^-3; 0.89 (0.82 – 0.96); 0.452/0.481 | 7.18 × 10^-3; 1.11 (1.03 – 1.20); 0.492/0.462 | 7.75 × 10^-5; 0.84 (0.77 – 0.92); 0.300/0.338 |
| Combined (BE+EA) | Meta-analysis | 5,561/10,118 | 5.47 × 10^-9; 1.18 (1.12 – 1.25); 0.335/0.303 | 1.02 × 10^-9; 0.83 (0.79 – 0.88); 0.257/0.283 | 5.84 × 10^-8; 0.87 (0.82 – 0.91); 0.464/0.489 | 3.55 × 10^-10; 1.18 (1.12 – 1.24); 0.492/0.456 | 1.75 × 10^-9; 0.84 (0.80 – 0.89); 0.302/0.338 |

* Coded allele.

Table 2

SNPs in region 16q24 on chromosome 16 near the FOXF1 gene associated with Barrett’s esophagus and esophageal adenocarcinoma. Shown are the discovery, replication and meta-analysis results for SNPs within 100 kb upstream or downstream of rs9936833, a previously identified susceptibility locus for Barrett’s esophagus.[14] This table contains rs9936833 together with SNPs in this region that either had more significant P-values than rs9936833 in the combined Barrett’s esophagus and esophageal adenocarcinoma data set (PDISCOVERY(BE+EA)) or were the most significant SNP in conditional analysis after fitting other SNPs (Table 3).

| SNP | Position | Minor*/major allele (plus strand) |
|---|---|---|
| rs1490865 | 86387275 | G/A |
| rs3111601 | 86400081 | G/A |
| rs9936833 | 86403118 | G/A |
| rs1728400 | 86434446 | A/C |
| rs3950627 | 86436343 | G/T |
| rs2178146 | 86463695 | G/A |
| rs13332095 | 86465590 | A/G |

Each result cell gives P; OR (95% CI); MAF case/control. P-values are PDISCOVERY, PREPLICATION or PMETA for the phenotype in that row. Replication and meta-analysis results are available for two SNPs only.

| Phenotype | Phase | Cases/controls | rs1490865 | rs3111601 | rs9936833 | rs1728400 | rs3950627 | rs2178146 | rs13332095 |
|---|---|---|---|---|---|---|---|---|---|
| Barrett’s esophagus | Discovery | 2,416/3,206 | 7.00 × 10^-2; 1.09 (0.99 – 1.19); 0.258/0.241 | 1.22 × 10^-4; 1.17 (1.08 – 1.27); 0.332/0.303 | 1.69 × 10^-4; 1.16 (1.07 – 1.26); 0.397/0.366 | 3.14 × 10^-6; 1.20 (1.11 – 1.30); 0.473/0.434 | 2.88 × 10^-6; 1.20 (1.11 – 1.30); 0.504/0.465 | 3.65 × 10^-4; 0.87 (0.80 – 0.94); 0.396/0.424 | 1.29 × 10^-2; 1.17 (1.03 – 1.33); 0.105/0.093 |
| Barrett’s esophagus | Replication | 759/6,911 | | 5.01 × 10^-1; 1.04 (0.93 – 1.17); 0.324/0.313 | | | | 3.67 × 10^-1; 0.95 (0.85 – 1.06); 0.396/0.408 | |
| Barrett’s esophagus | Meta-analysis | 3,175/10,117 | | 4.24 × 10^-4; 1.13 (1.05 – 1.20); 0.330/0.310 | | | | 6.24 × 10^-4; 0.89 (0.84 – 0.95); 0.396/0.413 | |
| Esophageal adenocarcinoma | Discovery | 1,516/3,209 | 8.60 × 10^-1; 0.99 (0.89 – 1.10); 0.243/0.241 | 5.64 × 10^-5; 1.22 (1.11 – 1.35); 0.343/0.303 | 2.06 × 10^-3; 1.16 (1.05 – 1.27); 0.397/0.366 | 4.94 × 10^-3; 1.14 (1.04 – 1.25); 0.460/0.434 | 4.70 × 10^-3; 1.14 (1.04 – 1.25); 0.494/0.465 | 1.31 × 10^-5; 0.81 (0.74 – 0.89); 0.380/0.424 | 1.42 × 10^-3; 1.28 (1.10 – 1.48); 0.112/0.093 |
| Esophageal adenocarcinoma | Replication | 874/6,911 | | 1.67 × 10^-1; 1.08 (0.97 – 1.20); 0.330/0.313 | | | | 4.22 × 10^-2; 0.90 (0.81 – 1.00); 0.385/0.408 | |
| Esophageal adenocarcinoma | Meta-analysis | 2,390/10,120 | | 8.49 × 10^-5; 1.16 (1.08 – 1.24); 0.331/0.310 | | | | 4.37 × 10^-6; 0.85 (0.79 – 0.91); 0.392/0.413 | |
| Combined (BE+EA) | Discovery | 3,928/3,207 | 1.95 × 10^-1; 1.05 (0.97 – 1.14); 0.252/0.242 | 3.02 × 10^-6; 1.19 (1.11 – 1.28); 0.336/0.303 | 3.44 × 10^-5; 1.16 (1.08 – 1.24); 0.397/0.366 | 2.87 × 10^-6; 1.18 (1.10 – 1.26); 0.468/0.434 | 2.07 × 10^-6; 1.18 (1.10 – 1.26); 0.500/0.465 | 2.10 × 10^-6; 0.84 (0.79 – 0.91); 0.390/0.425 | 7.69 × 10^-4; 1.21 (1.08 – 1.36); 0.108/0.093 |
| Combined (BE+EA) | Replication | 1,633/6,911 | | 2.24 × 10^-1; 1.05 (0.97 – 1.15); 0.327/0.313 | | | | 5.04 × 10^-2; 0.92 (0.85 – 1.00); 0.390/0.408 | |
| Combined (BE+EA) | Meta-analysis | 5,561/10,118 | | 1.56 × 10^-5; 1.13 (1.07 – 1.19); 0.331/0.310 | | | | 1.14 × 10^-6; 0.88 (0.83 – 0.92); 0.394/0.413 | |

* Coded allele.

Figure 1

Plot of genome-wide association results from the discovery data for the combined Barrett’s esophagus and esophageal adenocarcinoma cases using an additive logistic regression model with age, sex and the first four eigenvectors from principal components analysis as covariates. Results are shown for 3,928 cases (2,414 Barrett’s esophagus, 1,514 esophageal adenocarcinoma) and 3,207 controls for 801,552 autosomal and X chromosome SNPs that passed quality control and have a minor allele frequency > 1%. Chromosomes are delineated by alternating colors, as labeled on the x-axis. The y-axis shows the −log10 P-values.

We selected 94 associated (P < 10^-4) SNPs for replication. Of these, 87 were genotyped in 874 histologically confirmed esophageal adenocarcinoma cases from the Stomach and Oesophageal Cancer Study (SOCS), 759 histologically confirmed Barrett’s esophagus cases from the UK Barrett’s Oesophagus Gene Study (UK Gene Study) and 6,911 controls, of which 1,711 were from the SEARCH Study and 5,200 from the Wellcome Trust Case Control Consortium 2 (WTCCC2).[12] All SOCS, UK Gene Study and SEARCH participants self-identified as Caucasian and were genotyped on a Fluidigm 96.96 Dynamic Array IFC. WTCCC2 subjects were of European ancestry, as determined by projection onto the first two principal components of a PCA of HapMap individuals, and were genotyped on a custom version of the Illumina Human1.2M-Duo array. Replication analysis was done using an additive logistic regression model with sex as a covariate. METAL software[13] was used for meta-analysis of the discovery and replication data sets.

The three loci that reached genome-wide significance (P < 5 × 10^-8) in the combined case group meta-analysis are given in Table 1, and results for all replicated SNPs are given in Supplementary Table 3. The most strongly associated SNP at each of the three loci had similar ORs in Barrett’s esophagus and esophageal adenocarcinoma. None of the top imputed SNPs showed substantially stronger association than the genotyped SNPs.

The most significant locus was at 19p13 (Figure 2a): rs10419226, PMETA(BE+EA) = 3.55 × 10^-10, odds ratio (OR) = 1.18, 95% confidence interval (CI) = 1.12 – 1.24. Five imputed SNPs in high linkage disequilibrium (LD) with rs10419226 (r^2 > 0.85) were genome-wide significant in the combined discovery data set and two were also significant in the Barrett’s esophagus discovery set (Supplementary Table 4). A second significant locus was at 9q22.32 (Figure 2b): rs11789015, PMETA(BE+EA) = 1.02 × 10^-9, OR (CI) = 0.83 (0.79 – 0.88). The third genome-wide significant locus was at 3p13 near FOXP1 (Figure 2c): rs2687201, PMETA(BE+EA) = 5.47 × 10^-9, OR (CI) = 1.18 (1.12 – 1.25).
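To make the meta-analysis step concrete, the sketch below pools the discovery and replication estimates for rs10419226 (combined BE+EA rows of Table 1) with a fixed-effects inverse-variance scheme. This is an assumption: METAL supports both sample-size-weighted and inverse-variance weighting, and the text does not say which was used. Standard errors are back-calculated from the published, rounded 95% CIs, so the result should only land near the reported meta-analysis values rather than reproduce them exactly. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval

def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover log(OR) and its standard error from a reported 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return beta, se

def inverse_variance_meta(estimates):
    """Fixed-effects pooling of (beta, se) pairs; returns OR, CI bounds, P."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))  # two-sided normal P-value
    return (math.exp(beta), math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se), p)

# rs10419226, combined BE+EA: discovery and replication rows of Table 1.
discovery = log_or_and_se(1.23, 1.15, 1.32)
replication = log_or_and_se(1.11, 1.03, 1.20)
pooled_or, ci_lo, ci_hi, pooled_p = inverse_variance_meta([discovery, replication])
print(f"OR = {pooled_or:.2f} ({ci_lo:.2f} - {ci_hi:.2f}), P = {pooled_p:.1e}")
```

Inverse-variance weighting was chosen here only because the ORs and CIs are all that the table provides; a sample-size-weighted z-score combination would need the per-study test statistics.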
Figure 2

Regional association plots showing genotyped and imputed SNPs from the discovery data for the combined Barrett’s esophagus + esophageal adenocarcinoma cases for three newly discovered loci (a–c) and one previously identified locus (d). Genotyped SNPs are indicated by solid triangles, and imputed SNPs are indicated by hollow circles. The top-ranked SNP at each locus is shown as a solid purple diamond, except in (d) where it is rs9936833. SNPs are ordered by genomic location. The color scheme indicates linkage disequilibrium between the top-ranked SNP and other SNPs in the region using the r^2 value calculated from the 1000 Genomes Project. The y-axis is the −log10 P-value computed from 3,928 cases (2,414 Barrett’s esophagus, 1,514 esophageal adenocarcinoma) and 3,207 controls. Imputation P-values for all SNPs are plotted. Note that imputed and genotyped P-values for genotyped SNPs differ slightly because the imputed analysis was based on dosage scores, whereas the genotyped analysis used the hard genotype calls. The recombination rate from CEU HapMap data (right-side y-axis) is shown in light blue. (a) Chromosome 19p13 region. (b) Chromosome 9q22 region. (c) Chromosome 3p13 region. (d) Chromosome 16q24 region.

A previous study of Barrett’s esophagus identified the SNP rs9936833, near the putative tumor suppressor gene FOXF1.[14] A subset of the BEACON samples from the present study (2,398 Barrett’s esophagus cases, 2,167 controls) was used in the replication analysis of rs9936833 (P = 5.13 × 10^-4, OR(CI) = 1.16 (1.07 – 1.27)). With the additional samples used here (18 Barrett’s esophagus cases and 1,042 controls) the P-value was more significant (PDISCOVERY(BE) = 1.69 × 10^-4), but with no change in the odds ratio (OR(CI) = 1.16 (1.07 – 1.26)). This SNP was also associated with esophageal adenocarcinoma (PDISCOVERY(EA) = 2.06 × 10^-3, OR(CI) = 1.16 (1.05 – 1.27)) (Table 2). Examining the regional association plot of chromosome 16 near rs9936833 for the combined data (Figure 2d), we identified four nearby SNPs that had more significant P-values than rs9936833. To test whether these associations were due to one or multiple variants, we fine-mapped this region using conditional analysis. The discovery and replication results are shown in Table 2, with conditional analysis results in Table 3. Our results indicate that these markers define a set of complex susceptibility alleles, with between two and four independent loci.
Table 3

Step-wise conditional analysis to test for independent SNP signals in region 16q24. Each SNP is the most significant SNP 100kb up or downstream of rs9936833 after fitting all SNPs in the previous rows as additional covariates in the same logistic regression model used in the primary analysis. Starting with the most significant SNP in this region, rs3950627, this methodology fit four other SNPs (rows 2–5) before stopping when the p-value of the most significant remaining SNP had P > 0.01. The last row is the P-value of an association test of rs9936833 in each of these models.

Each cell gives P-value; OR (CI). Column headers list the fitted SNP(s).

| rsID | Position | None | rs3950627 | rs3950627 + rs2178146 | rs3950627 + rs2178146 + rs3111601 | rs3950627 + rs2178146 + rs3111601 + rs1490865 |
|---|---|---|---|---|---|---|
| rs3950627 | 86436343 | 0.0000021; 1.18 (1.10 – 1.26) | | | | |
| rs2178146 | 86463695 | 0.0000021; 0.84 (0.79 – 0.91) | 0.00119; 0.88 (0.82 – 0.95) | | | |
| rs3111601 | 86400081 | 0.0000030; 1.19 (1.11 – 1.28) | 0.00225; 1.13 (1.05 – 1.22) | 0.00478; 1.12 (1.04 – 1.21) | | |
| rs1490865 | 86387275 | 0.1949656; 1.05 (0.97 – 1.14) | 0.02464; 1.10 (1.01 – 1.19) | 0.02582; 1.10 (1.01 – 1.19) | 0.00140; 1.15 (1.05 – 1.25) | |
| rs13332095 | 86465590 | 0.0007695; 1.21 (1.08 – 1.36) | 0.00147; 1.20 (1.07 – 1.34) | 0.01658; 1.15 (1.03 – 1.30) | 0.00990; 1.17 (1.04 – 1.31) | 0.00490; 1.18 (1.05 – 1.33) |
| rs9936833 | 86403118 | 0.0000344; 1.16 (1.08 – 1.24) | 0.01529; 1.10 (1.02 – 1.19) | 0.02265; 1.09 (1.01 – 1.18) | 0.84929; 0.99 (0.86 – 1.14) | 0.69573; 1.03 (0.89 – 1.19) |

The previous study of Barrett’s esophagus also identified rs9257809 in the major histocompatibility complex (HLA) as being associated with increased risk of Barrett’s esophagus.[14] The significance in our extended data set (PDISCOVERY(BE) = 0.11, OR(CI) = 1.11 (0.99 – 1.29)) is not substantially different from the original replication results (P = 0.083, OR(CI) = 1.13 (0.98 – 1.30)). We found a similar OR for esophageal adenocarcinoma as for Barrett’s esophagus, suggesting that the previously identified genome-wide significant Barrett’s esophagus association also plays a role in esophageal adenocarcinoma risk (PDISCOVERY(EA) = 0.09, OR(CI) = 1.14 (0.97 – 1.32); PDISCOVERY(BE+EA) = 0.06, OR(CI) = 1.12 (1.01 – 1.27)).

A previous report using BEACON data showed that Barrett’s esophagus and esophageal adenocarcinoma risk is influenced by a large number of common genetic variants, and that a large proportion of the genes affecting risk of these two conditions are shared between them.[15] These findings informed our choice of analysis approach, with a primary focus on the combined esophageal adenocarcinoma and Barrett’s esophagus samples. The utility of this approach was borne out in the results presented here. The ORs after meta-analyses comparing esophageal adenocarcinoma cases vs. controls and Barrett’s esophagus cases vs. controls for our top five loci are very similar, and direct comparison of the two case types revealed no significant differences (data not shown). Combining the two case types allows these SNPs to clearly achieve genome-wide significance in the combined data sets (Table 1).

The SNPs in 19p13.11, rs10419226 and rs10423674, are intronic variants in CRTC1 (CREB-regulated transcription co-activator), whose aberrant activation has been associated with oncogenic activity.[16] Phosphorylation of CRTC1 is regulated by the tumor suppressor kinase LKB1. Down-regulation or loss of LKB1 expression in human esophageal cancer cell lines and patient samples resulted in activated CRTC1 signaling and the transcriptional activation of downstream targets including LYPD3, which is associated with cancer metastasis, leading to the promotion of esophageal cancer cell migration and invasion.[16] rs10419226 has been shown to be an expression quantitative trait locus (eQTL) for PIK3R2 in lymphoblastoid cell lines.[17] PIK3R2 is known to be involved in cancer[18] and is expressed in gastrointestinal tumors.[19] PIK3R2 is also known to interact with epidermal growth factor (EGF), which plays an important physiological role in the maintenance of esophageal and gastric tissue integrity. The biological effects of salivary EGF include healing of oral and gastroesophageal ulcers and inhibition of gastric acid secretion.[20] Furthermore, the EGF receptor has been found in gastrointestinal tissue and demonstrates increased expression in Barrett’s esophagus and esophageal adenocarcinoma.[21] The G/G genotype for the SNP EGF A61G is associated with a two- to four-fold increased risk of esophageal adenocarcinoma.[22,23] There are several SNPs in high LD (r^2 > 0.9) with rs10419226. Three of these, rs200331191, rs139340769 and rs8102046, lie in a region of probable promoter and enhancer activities across multiple cell lines.[24] The intronic CRTC1 SNP rs10423674 influences age at menarche.[25] The basis of this pleiotropic effect is unclear, but may be related to obesity, as CRTC1−/− mice are hyperphagic and obese.[26] rs10423674 is an eQTL for PBX4 in lymphoblastoid cell lines.

The nearest gene to the peak SNP on chromosome 3 (rs2687201) is FOXP1. The transcription factors FOXP1 and FOXP2 cooperatively regulate lung and esophagus development, and FOXP1 is a therapeutic target in cancer.[27,28] The FOX family is overexpressed in esophageal cancer.[14] There are several SNPs in high LD (r^2 > 0.8) with rs2687201 which lie within enhancer histone marks. One of them, rs7626449, is at a site where there is also evidence from DNase-seq for transcription factor binding in esophageal epithelial cells.[29]

The SNP rs11789015 lies in an intron of BARX1, a homeobox transcription factor known to be involved in esophageal and tracheal differentiation in developing mouse embryos and associated with down-regulation of Wnt pathway activity in stomach morphogenesis and specification.[30] The BARX1 promoter region is hypermethylated in gastric cancer (GC) cell lines and patient samples, and BARX1 mRNA expression is reduced in GC tissues and cell lines.[31] rs11789015 lies in a region where histone marks denote likely promoter activity. rs11789015 also alters a known regulatory motif for the transcription factor FOXP1. A correlated SNP, rs62574346 (r^2 = 0.97 with rs11789015), resides where there is also evidence from DNase-seq for transcription factor binding in esophageal epithelial cells.[29]

A subset of the BEACON data presented here (Supplementary Figure 3a) formed part of the replication arm of a recent Barrett’s esophagus GWAS.[14] A primary finding from that work was a Barrett’s esophagus association at the 16q24 SNP rs9936833. Here we found clear evidence that this locus is also associated with esophageal adenocarcinoma (PDISCOVERY(EA) = 2.06 × 10^-3, OR(CI) = 1.16 (1.05 – 1.27)), as did a recent small study (316 esophageal adenocarcinoma cases, 602 controls; OR(CI) = 1.21 (0.99 – 1.47)).[32] Two other SNPs near rs9936833, rs2178146 and rs3111601, have stronger and more significant associations in esophageal adenocarcinoma cases (Table 2 and Supplementary Figure 3b). Since the size and direction of effect of the Barrett’s esophagus-associated SNPs at 16q24 were similar in esophageal adenocarcinoma, we used the combined Barrett’s esophagus and cancer data to identify other SNPs which are more significantly associated in 16q24 than rs9936833 (Table 2). One of these is rs3111601, which is in high LD (r^2 = 0.75) with rs9936833. All of the SNPs in high LD with rs3111601 are intergenic, although rs1979654 (r^2 = 0.64 with rs3111601) stands out as having excellent regulatory potential across a wide range of cell types and is likely to affect protein binding, chromatin structure and histone modification.[29] There was evidence for additional independent signals in the region at rs3950627 (38 kb nearer to FOXF1) and rs2178146 (64 kb nearer to FOXF1), both of which had similar P-values to rs3111601 (Table 2 and Figure 2d) but were in only modest LD (r^2 < 0.2) with rs3111601. Neither is a good regulatory candidate, although rs8045253 (imputation association PDISCOVERY(BE+EA) = 8.04 × 10^-5), which is in LD (r^2 = 0.63) with rs3950627, changes a motif for the transcription factor FOXP1. Changing the way that FOXP1 binds to this region is particularly interesting in light of our association findings on chromosome 3.

In summary, we report the first genome-wide association study of esophageal adenocarcinoma, and the first to examine this cancer together with its precancerous lesion, Barrett’s esophagus. Consistent with our findings showing extensive polygenic overlap between esophageal adenocarcinoma and Barrett’s esophagus,[15] our most significant results were for cancer and pre-cancer combined. Together, these findings suggest that much of the genetic basis for esophageal adenocarcinoma lies in the development of Barrett’s esophagus, rather than progression from Barrett’s esophagus to esophageal adenocarcinoma. We found three novel genome-wide significant loci for esophageal adenocarcinoma and Barrett’s esophagus combined, and extended existing findings at the FOXF1 and HLA loci. One of the novel regions is chromosome 3p13, near FOXP1, a gene encoding a transcription factor which regulates esophageal development. Interestingly, two of the other regions (BARX1/9q22.32 and FOXF1/16q24.1) contain risk-associated SNPs which disrupt binding of FOXP1. Further dissection of these loci is likely to lead to insights into the etiology of this rapidly fatal cancer.

Online Methods

DISCOVERY

Study subjects

Cases of Barrett’s esophagus and esophageal adenocarcinoma, together with associated population controls, were collected by investigators in the BEACON consortium. A subset of these individuals with European ancestry, from epidemiologic studies conducted in Western Europe, Australia and North America over the past twenty years, was used in the Barrett’s and Esophageal Adenocarcinoma Genetic Susceptibility Study (BEAGESS). To increase the statistical power of the study we included additional controls from a hospital-based case-control study of melanoma.[10] These controls (“MD Anderson controls”) were cancer-free friends or acquaintances of European ancestry who had accompanied melanoma patients to their clinical visits at the MD Anderson Cancer Center in Houston, Texas. The distribution of samples by study is given in Supplementary Table 1. Histological confirmation of esophageal adenocarcinoma was carried out for all esophageal adenocarcinoma studies. Similarly, Barrett’s esophagus was histologically confirmed via identification of goblet cells in metaplastic columnar epithelium in a biopsy taken from the esophagus. Age, sex (Supplementary Table 2) and other esophageal adenocarcinoma/Barrett’s esophagus risk factors were collected by all of the included studies via standardized questionnaires, usually through personal interviews. All recruited participants gave informed consent and this project was approved by the ethics boards of each participating institution.

Genotyping

BEAGESS specimens were shipped to the Fred Hutchinson Cancer Research Center (Seattle, WA) where they were processed and genotyped in three batches. In each batch, samples on each genotyping plate were stratified and balanced according to case/control status, study, and gender with samples assigned to plates randomly within those strata. Genotyping of DNA from buffy coat or whole blood was performed using the Illumina HumanOmni1-Quad platform. MD Anderson controls were genotyped using the Illumina HumanOmni1-Quad platform at the Johns Hopkins University Center for Inherited Disease Research (CIDR). SNP annotations were based on version H of the Illumina product files and corresponded to the Genome Reference Consortium GRCh37 release.

Quality control

Quality assurance and quality control (QA/QC) of the BEAGESS and MD Anderson data sets were carried out independently by the Genetics Coordinating Center at the University of Washington following standard procedures.[33] QA/QC of the MD Anderson data set was described previously.[10] BEAGESS samples with call rate < 95%, admixture of more than one DNA source, unexpected relatedness (including unexpected duplicates), or misannotated sex that could not be explained were removed from the data set. We looked for batch and plate effects using intensity data and allelic frequency and checked for case-control associations with different experimental factors. No important batch or plate effects or case-control associations with experimental factors were found. We used heterozygosity, sex chromosome intensity data, identity by descent (IBD) analysis and visualization of B allele frequency (BAF) and log R ratio (LRR) plots to identify samples that had misannotated sex or unexpected relatedness, or that were sample mixtures.[34] Two additional sample mixtures were removed from the data set. After further sample filtering to keep only unrelated European ancestry samples (see next section), 2,416 Barrett’s esophagus cases, 1,516 esophageal adenocarcinoma cases and 2,187 controls remained. These samples were combined with 1,022 European ancestry controls from the MD Anderson data set for discovery analysis.

SNPs were clustered using Illumina’s GenomeStudio software, with SNP clusters defined using all samples with a call rate > 95%. SNPs that had either a GenTrain score < 0.6 or a cluster separation value < 0.4 had their genotypes set to missing. Additionally, we filtered SNPs that were intensity-only, had a missing call rate > 5%, had a Hardy-Weinberg equilibrium P-value in controls < 10^-4, had a discordance among any of the duplicate pairs, or had a Mendelian error in either BEAGESS families or HapMap trios. These filters were combined with similar filters calculated for the MD Anderson data set.[10] Additionally, we removed a further 344 SNPs that were discordant in the same HapMap control samples (n=3) run in both the BEAGESS and MD Anderson data sets. After QA/QC a total of 926,923 SNPs remained for analysis.
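The threshold-based SNP filters can be sketched as a simple predicate. The sketch below is illustrative only: the dictionary field names and example SNPs are hypothetical, SNPs with low GenTrain or cluster separation scores are dropped outright rather than having their genotypes set to missing, and Hardy-Weinberg equilibrium is tested with a 1-df chi-square approximation, which may differ from the test actually used. It treats a control P-value below 10^-4 as a failure, the usual convention. Duplicate discordance and Mendelian error checks are omitted.

```python
import math

def hwe_p_value(n_aa, n_ab, n_bb):
    """1-df chi-square test of Hardy-Weinberg proportions (an approximation)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    if p in (0.0, 1.0):
        return 1.0                           # monomorphic: nothing to test
    expected = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Upper tail of a 1-df chi-square equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_snp_qc(snp):
    """Apply the numeric thresholds quoted in the Quality control section."""
    if snp["gentrain"] < 0.6 or snp["cluster_sep"] < 0.4:
        return False
    if snp["missing_rate"] > 0.05:
        return False
    if hwe_p_value(*snp["control_counts"]) < 1e-4:
        return False
    return True

# Hypothetical SNP summaries; control_counts are (AA, AB, BB) in controls.
snps = [
    {"id": "snp_ok", "gentrain": 0.85, "cluster_sep": 0.90,
     "missing_rate": 0.01, "control_counts": (250, 500, 250)},
    {"id": "snp_low_gentrain", "gentrain": 0.50, "cluster_sep": 0.90,
     "missing_rate": 0.01, "control_counts": (250, 500, 250)},
    {"id": "snp_high_missing", "gentrain": 0.85, "cluster_sep": 0.90,
     "missing_rate": 0.08, "control_counts": (250, 500, 250)},
    {"id": "snp_hwe_fail", "gentrain": 0.85, "cluster_sep": 0.90,
     "missing_rate": 0.01, "control_counts": (500, 0, 500)},
]
kept = [s["id"] for s in snps if passes_snp_qc(s)]
print(kept)
```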

Principal Components Analysis

We performed Principal Component Analysis (PCA) as a two-step process using the SNPRelate software. First, we used PCA to define a homogeneous set of European ancestry samples in the BEAGESS data set. We did this by running PCA on a set of 6,248 unrelated (except for six two-person families) subjects, each of whom was an EA case, a BE case, or a control. A majority of these subjects (~98%) self-identified their race as “White”, and a scatterplot of all subjects along the axes of the first two eigenvectors showed that the majority of samples formed a tight cluster (Supplementary Figure 4). Therefore, we computed the means and standard deviations (SD) of the first two eigenvectors and defined any sample that fell within a two-SD rectangle around both eigenvector means to be of homogeneous European ancestry (n=6,125).

Second, we ran PCA on the combined data set (n=7,147) consisting of the BEAGESS European ancestry samples (n=6,125) and the similarly defined set of MD Anderson controls (n=1,022). The intent here was to identify eigenvectors to include as covariates in our model to adjust for population differences that were present in the remaining European-ancestry-only samples.[35,36] For this analysis we selected 65,774 SNPs that were non-monomorphic and autosomal, passed quality control, had missing call rate < 5% and minor allele frequency > 5%, did not have an LD value > 0.2 with any other SNP in a sliding window of 500 kb, and were not in the LCT gene (2q21), the HLA region, or the polymorphic regions on chromosomes 8 (8p23) and 17 (17q21.31). We included the first four eigenvectors as covariates in the association test model to account for population stratification by ancestry, because they were significantly correlated with case-control status and a scree plot showed that the variance accounted for by each eigenvector flattened out after these four eigenvectors (data not shown).

To check that only genome-wide variation was detected, we computed the absolute value of the correlation coefficient of each eigenvector against the genotypes for each SNP. We observed one small region of high correlation (ρ ≈ 0.4) between eigenvector one and chromosome two, which may indicate long-range LD with the LCT gene.
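The two-SD ancestry filter from the first PCA step can be sketched as follows. This is an illustration in Python, not the R/SNPRelate code used in the study; the function name `within_two_sd`, the toy eigenvector values, and the choice of the sample (n-1) standard deviation are assumptions made for the example.

```python
import statistics

def within_two_sd(ev1, ev2, n_sd=2.0):
    """Flag samples inside an n_sd rectangle around the means of two eigenvectors.

    ev1, ev2: per-sample loadings on the first and second eigenvector.
    Assumption: the sample (n-1) standard deviation is used.
    """
    m1, s1 = statistics.mean(ev1), statistics.stdev(ev1)
    m2, s2 = statistics.mean(ev2), statistics.stdev(ev2)
    return [
        abs(a - m1) <= n_sd * s1 and abs(b - m2) <= n_sd * s2
        for a, b in zip(ev1, ev2)
    ]

# Toy data (hypothetical): nine clustered samples and one outlier on eigenvector 1.
ev1 = [0.0] * 9 + [10.0]
ev2 = [0.0, 0.1, -0.1, 0.0, 0.1, -0.1, 0.0, 0.1, -0.1, 0.0]
flags = within_two_sd(ev1, ev2)
kept = [i for i, ok in enumerate(flags) if ok]  # indices of retained samples
```

A sample is retained only if it lies within the bounds on both eigenvectors, which is what makes the region a rectangle rather than an ellipse.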

Statistical Analysis

After excluding six related samples and six other samples that had missing call rate > 2%, we ran a case-control analysis of the remaining 7,135 samples: 3,928 cases (1,514 EA and 2,414 BE) vs. 3,207 controls. We used an additive logistic regression model with case status regressed on each SNP’s genotype score (coded as 0, 1, or 2 for BB, AB and AA), including age, sex and the first four PCA eigenvectors as covariates, to compute the odds ratios (OR) and 95% confidence intervals (95% CI) relating risk of EA or BE to a given SNP variant. To test SNPs on the X chromosome, male genotypes were coded as 0 and 2 and female genotypes as 0, 1 and 2. After filtering SNPs that did not pass quality control and SNPs with a minor allele frequency < 1%, the λ value was 1.04. The QQ plot is shown in Supplementary Figure 2. We also compared Barrett’s esophagus and esophageal adenocarcinoma cases separately against the controls using the same model. The corresponding Manhattan and QQ plots are shown in Supplementary Figures 1 and 2, respectively. Analysis was carried out in the R statistical programming language[37] using the Bioconductor packages GWASTools[11] and SNPRelate.[38]

Using the combined BE + EA discovery data set, we performed a step-wise series of nested logistic regression analyses to test the independence of the associations in 16q24 near rs9936833. We used the same logistic regression model and covariates as in our primary analysis, and also fitted rs3950627 as a covariate since it was the most significant SNP within 100 kb upstream or downstream of rs9936833. This conditional analysis identified rs2178146 as the most significant remaining SNP within 100 kb upstream or downstream of rs9936833. We repeated this analysis four more times, identifying and adding to each successive model rs2178146, rs3111601, rs1490865, and rs13332095, respectively, and stopping when the most significant remaining SNP had P > 0.01 (Table 3).
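As a rough illustration of the additive model, the sketch below fits a logistic regression by Newton-Raphson and converts the genotype coefficient into an OR with a 95% CI. It is plain Python rather than the R/GWASTools code used in the study. The helper names (`solve`, `score_and_info`, `fit_logistic`) and the toy counts are assumptions, and only sex is included as a covariate to keep the example short; age and the four eigenvectors would enter as further columns.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def score_and_info(rows, y, beta):
    """Gradient of the log-likelihood and the information matrix at beta."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, yi in zip(rows, y):
        p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
        for i in range(k):
            grad[i] += (yi - p) * x[i]
            for j in range(k):
                info[i][j] += p * (1.0 - p) * x[i] * x[j]
    return grad, info

def fit_logistic(rows, y, n_iter=25):
    """Newton-Raphson fit; each row must start with 1.0 for the intercept."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad, info = score_and_info(rows, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    _, info = score_and_info(rows, y, beta)
    # Standard errors are square roots of the diagonal of the inverse information.
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))
    return beta, se

# Toy data (hypothetical counts): (genotype score, case status) -> number of subjects.
counts = {(0, 0): 40, (1, 0): 40, (2, 0): 20, (0, 1): 20, (1, 1): 40, (2, 1): 40}
rows, y = [], []
for (g, status), n in counts.items():
    for i in range(n):
        rows.append([1.0, float(g), float(i % 2)])  # intercept, genotype, sex
        y.append(status)

beta, se = fit_logistic(rows, y)
odds_ratio = math.exp(beta[1])  # per-allele OR; 2.0 by construction of the toy counts
ci_95 = (math.exp(beta[1] - 1.96 * se[1]), math.exp(beta[1] + 1.96 * se[1]))
```

The toy counts are chosen so that the log-odds of being a case rise by log 2 per allele and sex is balanced within every cell, so the fitted per-allele OR is 2 and the sex coefficient is 0.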

Imputation

To assess the impact of variants not present on the Illumina HumanOmni1-Quad, we imputed genotypes using the MaCH[39] software and a European reference panel from the 1000 Genomes Project. At each region in Tables 1 and 2, imputation was done in two steps. First, haplotypes were estimated in a pre-phasing step. Second, missing alleles for additional SNPs were imputed onto these phased haplotypes using Minimac[39] and a publicly available reference panel of haplotypes from European ancestry populations. SNPs with very different allele frequencies (a chi-square statistic > 40 in a test for difference in allele frequency) between the BEACON data and the reference panel were removed prior to the second step. SNPs with MaCH imputation r2 < 0.3 (a measure of imputation quality) and SNPs with a minor allele frequency < 1% were also removed. Association analysis between imputed SNPs and disease status was performed using the same regression model as for genotyped SNPs, but with dosage probabilities as predictors rather than the actual genotype calls. All association tests were two-sided. Linkage disequilibrium (LD) calculations (r2) were computed with the discovery data when the two SNPs being compared were both genotyped; otherwise, the European ancestry samples from Phase 1 of the 1000 Genomes Project were used.
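The post-imputation filter and the idea of a dosage predictor can be sketched as below. This is an assumption-laden illustration: the function names are hypothetical, and the dosage is taken to be the expected count of the A allele under the 0/1/2 coding for BB/AB/AA, which is the usual convention but is not spelled out in the text.

```python
def dosage(p_ab, p_aa):
    """Expected number of A alleles given imputed genotype probabilities.

    Assumes the 0/1/2 coding for BB/AB/AA, so P(BB) contributes nothing.
    """
    return p_ab + 2.0 * p_aa

def keep_imputed_snp(r2, maf, r2_min=0.3, maf_min=0.01):
    """Keep a SNP unless its imputation r2 is < 0.3 or its MAF is < 1%."""
    return r2 >= r2_min and maf >= maf_min

example_dosage = dosage(p_ab=0.3, p_aa=0.6)       # 0.3 + 2 * 0.6
well_imputed = keep_imputed_snp(r2=0.9, maf=0.2)  # passes both thresholds
poorly_imputed = keep_imputed_snp(r2=0.25, maf=0.2)
too_rare = keep_imputed_snp(r2=0.9, maf=0.005)
```

The dosage replaces the integer genotype score in the regression, so uncertainty in the imputed genotype shrinks the predictor toward intermediate values instead of forcing a hard call.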

REPLICATION

SNP selection

We selected post-QA/QC SNPs for replication that had a discovery p-value < 10^-4 and a minor allele frequency > 1%. This yielded 406 SNPs: 179 from Barrett’s esophagus and esophageal adenocarcinoma vs. controls, 105 from esophageal adenocarcinoma vs. controls and 122 from Barrett’s esophagus vs. controls, of which 321 were unique. A subset of these SNPs (n=111) was selected via LD pruning with the PLINK clump command[40]; if a SNP was in LD > 0.5 with any other SNP(s) in the list, the SNP with the lowest p-value was selected for replication. For each of the ten SNPs on this list with the smallest p-value, we selected an extra “proxy” SNP to include in case the top SNP was not successfully genotyped in the replication set. These proxy SNPs were in high LD with the top SNP, but had a less significant p-value. We visually examined cluster plots of all SNPs and kept only those that were of high quality. SNPs were rank-ordered by p-value and replication was attempted for the top 94.
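The selection rule (keep the SNP with the lowest p-value among SNPs in LD > 0.5) can be illustrated with a simplified greedy procedure. This is not PLINK's clump implementation, which also uses physical distance and secondary p-value thresholds; the function name `ld_prune` and the SNP labels and values are hypothetical.

```python
def ld_prune(pvalues, r2, threshold=0.5):
    """Greedy LD pruning: visit SNPs from most to least significant and keep
    a SNP only if its r2 with every already-kept SNP is <= threshold.

    pvalues: dict mapping SNP id -> p-value
    r2: dict mapping frozenset({snp_a, snp_b}) -> pairwise r2 (missing = 0)
    """
    kept = []
    for snp in sorted(pvalues, key=pvalues.get):
        if all(r2.get(frozenset((snp, k)), 0.0) <= threshold for k in kept):
            kept.append(snp)
    return kept

# Hypothetical SNPs: snpB is in strong LD with the more significant snpA.
pvalues = {"snpA": 1e-6, "snpB": 5e-5, "snpC": 2e-5}
r2 = {
    frozenset(("snpA", "snpB")): 0.8,
    frozenset(("snpB", "snpC")): 0.1,
}
selected = ld_prune(pvalues, r2)
```

Here `snpA` and `snpC` are retained and `snpB` is dropped, because `snpB` is in LD above the threshold with a SNP that has a smaller p-value.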

Study Subjects

The replication cohort consisted of Barrett’s esophagus cases, esophageal adenocarcinoma cases and two control sets. Barrett’s esophagus cases were identified at endoscopy with a confirmed histopathological diagnosis of intestinal metaplasia from the UK Barrett’s Oesophagus Gene Study. Esophageal adenocarcinoma cases were selected from the Stomach Oesophageal Cancer Study and had an ICD coding of malignant neoplasm of the esophagus (C15) and a pathological diagnosis of adenocarcinoma. One set of controls came from the SEARCH study, which ascertains eligible cases of breast, ovary, prostate, colorectal, melanoma and endometrial cancer from the UK Eastern Cancer Registration and Information Centre. Controls were ascertained by frequency matching on age (five-year age bands) and sex to the esophageal adenocarcinoma and Barrett’s esophagus cases, excluding individuals with a past history of cancer (excluding non-melanoma skin cancer). All recruited participants gave informed consent and the studies have been approved by the relevant institutional ethics review board. The other set of controls was from the Wellcome Trust Case Control Consortium 2 (WTCCC2). Barrett’s esophagus cases, esophageal adenocarcinoma cases and SEARCH controls were genotyped using the Fluidigm™ high-throughput platform and Fluidigm 96.96 Dynamic Arrays™ according to the manufacturer’s instructions and read using the Fluidigm EP1™. Each array is capable of running 96 samples against 96 SNP assays. Cases and controls were plated out in sets of 96 samples and combined into 384-well arrays for genotyping, with the cases and controls mixed on each 384-well plate. Genotypes were automatically called using the BioMark Genotyping Analysis™ software, but all cluster plots were also checked manually and adjusted as necessary. The WTCCC2 controls were genotyped on a custom version of the Illumina Human1.2M-Duo array.

Quality Control

We excluded Barrett’s esophagus cases, esophageal adenocarcinoma cases and SEARCH controls that had a low call rate or inconsistent gender, were duplicates, or had self-reported ethnicity of “non-white” or “missing”. This left 759 Barrett’s esophagus cases, 874 esophageal adenocarcinoma cases and 1,711 SEARCH controls. We excluded SNPs with missing call rate ≥ 5%, with significant differential missing call rates in cases and controls (p < 5 × 10^-4), with low minor allele frequency (defined as < 1%), and with significant departure from Hardy-Weinberg equilibrium (p < 0.0005), leaving 87 post-QC SNPs. We applied standard sample and SNP exclusion criteria to the WTCCC2 controls, keeping 5,190 post-QC European ancestry samples. There were 67 post-QC SNPs in the WTCCC2 controls in common with the 87 post-QC SNPs from the BE cases, EA cases and SEARCH controls.
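The SNP exclusion criteria for the replication data amount to four threshold checks, sketched here as a single predicate. The function and argument names are hypothetical, and the per-SNP summary statistics are assumed to have been computed elsewhere.

```python
def snp_passes_qc(missing_rate, diff_missing_p, maf, hwe_p):
    """Return True if a SNP survives the four replication QC filters.

    missing_rate: fraction of samples with no genotype call
    diff_missing_p: p-value for differential missingness, cases vs. controls
    maf: minor allele frequency
    hwe_p: p-value for departure from Hardy-Weinberg equilibrium
    """
    return (
        missing_rate < 0.05         # exclude missing call rate >= 5%
        and diff_missing_p >= 5e-4  # exclude differential missingness p < 5e-4
        and maf >= 0.01             # exclude minor allele frequency < 1%
        and hwe_p >= 5e-4           # exclude HWE p < 0.0005
    )

good_snp = snp_passes_qc(missing_rate=0.01, diff_missing_p=0.4, maf=0.25, hwe_p=0.6)
hwe_failure = snp_passes_qc(missing_rate=0.01, diff_missing_p=0.4, maf=0.25, hwe_p=1e-5)
high_missing = snp_passes_qc(missing_rate=0.05, diff_missing_p=0.4, maf=0.25, hwe_p=0.6)
```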

Statistical analysis

Each of the 87 SNPs was run using an additive logistic regression model with case status regressed on the SNP’s genotype and including sex as a covariate. The analysis focus was on the comparison of BE and EA cases against controls, but we also ran each case type separately against the controls. The final data set used for replication consisted of 759 BE cases and 874 EA cases. For 67 SNPs the control set consisted of 6,911 samples: 1,711 SEARCH controls and 5,200 WTCCC2 controls. For 20 SNPs that were not genotyped in the WTCCC2 data, only the 1,711 SEARCH controls were used. The R statistical programming language was used for all analyses.

META-ANALYSIS

We used the inverse variance-based method in the METAL software[13] to perform a meta-analysis of the discovery and replication data sets. This approach weights the effect size estimates (β-coefficients from the discovery and replication regression models) by the inverse of their squared standard errors and calculates an overall Z-score and p-value. This was done separately for each sample set.
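The inverse variance-weighted calculation can be written out in a few lines. This sketch shows the standard fixed-effects formula rather than METAL's own code; the function name and the input β and standard error values are illustrative assumptions, not results from the study.

```python
import math

def inverse_variance_meta(betas, ses):
    """Fixed-effects inverse-variance meta-analysis of per-study estimates.

    Each study is weighted by 1 / SE^2. Returns the pooled beta, its
    standard error, the Z-score and the two-sided p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return beta, se, z, p

# Illustrative values: (discovery, replication) log-odds ratios and standard errors.
beta, se, z, p = inverse_variance_meta(betas=[0.2, 0.1], ses=[0.05, 0.1])
pooled_or = math.exp(beta)
```

With these inputs the discovery estimate receives four times the weight of the replication estimate (weights 400 and 100), so the pooled β of 0.18 sits closer to the discovery value.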

BIOINFORMATICS/FUNCTIONAL GENOMICS

Each region of interest was interrogated using the eQTL browser, HaploReg, RegulomeDB, and the UCSC Genome Browser.