Common variation at 2p13.3, 3q29, 7p13 and 17q25.1 associated with susceptibility to pancreatic cancer.

Erica J Childs¹, Evelina Mocci², Daniele Campa³, Paige M Bracci⁴, Steven Gallinger⁵, Michael Goggins⁶, Donghui Li⁷, Rachel E Neale⁸, Sara H Olson⁹, Ghislaine Scelo¹⁰, Laufey T Amundadottir¹¹, William R Bamlet¹², Maarten F Bijlsma¹³, Amanda Blackford², Michael Borges⁶, Paul Brennan¹⁰, Hermann Brenner¹⁴, H Bas Bueno-de-Mesquita¹⁵, Federico Canzian¹⁶, Gabriele Capurso¹⁷, Giulia M Cavestro¹⁸, Kari G Chaffee¹², Stephen J Chanock¹¹, Sean P Cleary¹⁹, Michelle Cotterchio²⁰, Lenka Foretova²¹, Charles Fuchs²², Niccola Funel²³, Maria Gazouli²⁴, Manal Hassan⁷, Joseph M Herman²⁵, Ivana Holcatova²⁶, Elizabeth A Holly⁴, Robert N Hoover¹¹, Rayjean J Hung⁵, Vladimir Janout²⁷, Timothy J Key²⁸, Juozas Kupcinskas²⁹, Robert C Kurtz³⁰, Stefano Landi³¹, Lingeng Lu³², Ewa Malecka-Panas³³, Andrea Mambrini³⁴, Beatrice Mohelnikova-Duchonova³⁵, John P Neoptolemos³⁶, Ann L Oberg¹², Irene Orlow⁹, Claudio Pasquali³⁷, Raffaele Pezzilli³⁸, Cosmeri Rizzato¹⁶, Amethyst Saldia⁹, Aldo Scarpa³⁹, Rachael Z Stolzenberg-Solomon⁴⁰, Oliver Strobel⁴¹, Francesca Tavano⁴², Yogesh K Vashist⁴³, Pavel Vodicka⁴⁴, Brian M Wolpin⁴⁵, Herbert Yu⁴⁶, Gloria M Petersen¹², Harvey A Risch³², Alison P Klein⁴⁷.

Abstract

Pancreatic cancer is the fourth leading cause of cancer death in the developed world. Both inherited high-penetrance mutations in BRCA2 (ref. 2), ATM, PALB2 (ref. 4), BRCA1 (ref. 5), STK11 (ref. 6), CDKN2A and mismatch-repair genes and low-penetrance loci are associated with increased risk. To identify new risk loci, we performed a genome-wide association study on 9,925 pancreatic cancer cases and 11,569 controls, including 4,164 newly genotyped cases and 3,792 controls in 9 studies from North America, Central Europe and Australia. We identified three newly associated regions: 17q25.1 (LINC00673, rs11655237, odds ratio (OR) = 1.26, 95% confidence interval (CI) = 1.19-1.34, P = 1.42 × 10⁻¹⁴), 7p13 (SUGCT, rs17688601, OR = 0.88, 95% CI = 0.84-0.92, P = 1.41 × 10⁻⁸) and 3q29 (TP63, rs9854771, OR = 0.89, 95% CI = 0.85-0.93, P = 2.35 × 10⁻⁸). We detected significant association at 2p13.3 (ETAA1, rs1486134, OR = 1.14, 95% CI = 1.09-1.19, P = 3.36 × 10⁻⁹), a region with previous suggestive evidence in Han Chinese. We replicated previously reported associations at 9q34.2 (ABO), 13q22.1 (KLF5), 5p15.33 (TERT and CLPTM1), 13q12.2 (PDX1), 1q32.1 (NR5A2), 7q32.3 (LINC-PINT), 16q23.1 (BCAR1) and 22q12.1 (ZNRF3). Our study identifies new loci associated with pancreatic cancer risk.

Year: 2015 PMID: 26098869 PMCID: PMC4520746 DOI: 10.1038/ng.3341

Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330

Introductory Paragraph

Pancreatic cancer is the fourth leading cause of cancer death in the developed world[1]. Inherited high-penetrance mutations in BRCA2[2], ATM[3], PALB2[4], BRCA1[5], STK11[6], CDKN2A[7] and mismatch repair genes[8], as well as low-penetrance loci, are associated with increased risk[9-12]. To identify novel loci, we performed a genome-wide association study on 9,925 pancreatic cancer cases and 11,569 controls, including 4,164 newly genotyped cases and 3,792 controls in 9 studies from North America, Central Europe and Australia. Three newly associated regions were identified: 17q25.1 (LINC00673, rs11655237, OR=1.26, 95%CI:1.19–1.34, P=1.42×10⁻¹⁴), 7p13 (SUGCT, rs17688601, OR=0.88, 95%CI:0.84–0.92, P=1.41×10⁻⁸), and 3q29 (TP63, rs9854771, OR=0.89, 95%CI:0.85–0.93, P=2.35×10⁻⁸). Significant association was detected on 2p13.3 (ETAA1, rs1486134, OR=1.14, 95%CI:1.09–1.19, P=3.36×10⁻⁹), a region with prior suggestive evidence in the Han Chinese[12]. We replicate previously reported associations at 9q34.2 (ABO)[9], 13q22.1 (KLF5)[10], 5p15.33 (TERT, CLPTM1)[10,11], 13q12.2 (PDX1)[11], 1q32.1 (NR5A2)[10], 7q32.3 (LINC-PINT)[11], 16q23.1 (BCAR1)[11] and 22q12.1 (ZNRF3)[11]. Our study identifies novel loci associated with pancreatic cancer risk.

Main Text

We conducted a two-stage genome-wide association study (GWAS) of pancreatic cancer (Fig. 1). First, genome-wide genotyping of 8,052 subjects from nine studies within the Pancreatic Cancer Case-Control Consortium (PanC4) (Supplementary Table 1) was conducted using the HumanOmniExpressExome-8v1 array. The overall study was approved by the Johns Hopkins Institutional Review Board (IRB). Each individual study obtained IRB approval from its parent institution. After quality control (Online Methods, Fig. 1, Supplementary Table 2), 7,956 individuals (4,164 cases and 3,792 controls) and 654,470 SNPs with call rates greater than 98% were analyzed. Unconditional logistic regression analysis adjusted for age and the first seven principal component eigenvectors was conducted under the log-additive genetic model (Fig. 2, Supplementary Fig. 1).

Figure 1

Overview of Stage 1 and Stage 2 analyses

Figure 2

Manhattan plot of PanC4 association analysis. Loci previously associated with pancreatic cancer in Caucasians are shown in black, 2p13.3 in blue and novel loci in red.

Analysis of 7,956 newly genotyped PanC4 individuals identified a novel locus at 17q25.1 (LINC00673, rs7214041, OR=1.38, 95%CI:1.26–1.51, P=1.95×10⁻¹⁰) significantly associated with pancreatic cancer risk (Table 1, column ‘PanC4’). In addition, we replicated regions previously reported to be associated with pancreatic cancer in the Caucasian population (Supplementary Table 3). These include: 9q34.2[9] (ABO, rs505922, OR=1.27, 95%CI:1.19–1.35, P=1.72×10⁻¹³), 13q22.1[10] (KLF5, rs9543325, OR=1.24, 95%CI:1.16–1.32, P=2.26×10⁻¹⁰), 5p15.33[10] (CLPTM1, rs401681, OR=1.2, 95%CI:1.13–1.28, P=2.7×10⁻⁸), 13q12.2[11] (PDX1, rs9581943, OR=1.17, 95%CI:1.10–1.24, P=1.94×10⁻⁷), 1q32.1[10] (NR5A2, rs3790844, OR=0.83, 95%CI:0.77–0.90, P=3.05×10⁻⁶), 7q32.3[11] (LINC-PINT, rs6971499, OR=0.81, 95%CI:0.74–0.88, P=7.1×10⁻⁶), 5p15.33[11] (TERT, rs2736098, OR=0.85, 95%CI:0.78–0.93, P=2.31×10⁻⁵), 16q23.1[11] (BCAR1, rs7190458, OR=1.4, 95%CI:1.22–1.60, P=1.01×10⁻⁴), and 22q12.1[11] (ZNRF3, rs16986825, OR=1.14, 95%CI:1.04–1.24, P=2.72×10⁻³). In contrast, other than 2p13.3 (ETAA1, rs2035565, OR=1.15, 95%CI:1.07–1.25, P=2.69×10⁻⁴) (Supplementary Table 3), we observed no evidence of association (P>0.05) for SNPs previously reported to be associated (P<1×10⁻⁶) with pancreatic cancer in Asian populations[12,13]. While all ethnic groups were included in our analyses, over 92% of our study population reported Caucasian ancestry. We obtained similar results when the analysis was limited to individuals reporting European ancestry. Because of limited sample sizes, we did not conduct independent analyses of other ethnic groups (results not shown).

Table 1

Significant (P<5×10⁻⁸) association results for pancreatic cancer

			Stage 1				Stage 2
Chr (a); SNP; Position (b); Gene	Effect Allele (Minor)/Reference Allele	Statistic	PanC4 (4,164 cases; 3,792 controls)	PanScan 1 (1,856 cases; 1,890 controls)	PanScan 2 (1,618 cases; 1,682 controls)	Combined Stage 1 (c) (7,638 cases; 7,364 controls)	PANDoRA (2,497 cases; 4,611 controls)	Combined Stage 1&2 (d) (9,925 cases; 11,569 controls)
17q25.1 (h); rs11655237; 70,400,166; LINC00673	T/C	MAF (e), cases; controls	0.146; 0.110	0.139; 0.129	0.149; 0.116		0.135; 0.114
		info (f)	0.963	g	g
		OR (CI) (g)	1.38 (1.26 – 1.52)	1.09 (0.96 – 1.25)	1.34 (1.16 – 1.55)	1.27 (1.19 – 1.36)	1.24 (1.10 – 1.40)	1.26 (1.19 – 1.34)
		p-value	1.38×10⁻¹⁰	1.95×10⁻¹	2.95×10⁻⁴	6.74×10⁻¹²	6.40×10⁻⁴	1.42×10⁻¹⁴
17q25.1 (h); rs7214041; 70,401,476; LINC00673	T/C	MAF, cases; controls	0.148; 0.112	0.140; 0.133	0.150; 0.117		0.139; 0.117
		info	g	0.966	0.96
		OR (CI)	1.38 (1.26 – 1.51)	1.07 (0.93 – 1.22)	1.33 (1.15 – 1.53)	1.26 (1.18 – 1.35)	1.25 (1.11 – 1.41)	1.26 (1.19 – 1.34)
		p-value	1.95×10⁻¹⁰	3.36×10⁻¹	3.69×10⁻⁴	2.67×10⁻¹¹	3.37×10⁻⁴	2.88×10⁻¹⁴
2p13.3; rs1486134; 67,639,769; ETAA1 (2236bp 3′)	G/T	MAF, cases; controls	0.302; 0.275	0.305; 0.292	0.305; 0.276		0.292; 0.273
		info	g	g	g
		OR (CI)	1.14 (1.06 – 1.22)	1.06 (0.96 – 1.18)	1.15 (1.03 – 1.28)	1.13 (1.08 – 1.19)	1.16 (1.06 – 1.27)	1.14 (1.09 – 1.19)
		p-value	5.96×10⁻⁵	1.57×10⁻¹	5.18×10⁻³	8.35×10⁻⁷	9.42×10⁻⁴	3.36×10⁻⁹
7p13; rs17688601; 40,866,663; SUGCT	A/C	MAF, cases; controls	0.241; 0.263	0.218; 0.254	0.237; 0.268		0.254; 0.277
		info	g	g	g
		OR (CI)	0.89 (0.83 – 0.96)	0.82 (0.73 – 0.91)	0.85 (0.76 – 0.94)	0.87 (0.82 – 0.91)	0.91 (0.83 – 1.00)	0.88 (0.84 – 0.92)
		p-value	1.98×10⁻³	1.66×10⁻⁴	8.72×10⁻³	9.77×10⁻⁸	3.93×10⁻²	1.41×10⁻⁸
3q29; rs9854771; 189,508,471; TP63	A/G	MAF, cases; controls	0.328; 0.362	0.336; 0.366	0.325; 0.356		0.341; 0.356
		info	g	0.998	0.998
		OR (CI)	0.86 (0.81 – 0.92)	0.88 (0.80 – 0.90)	0.87 (0.79 – 0.97)	0.87 (0.83 – 0.92)	0.93 (0.86 – 1.01)	0.89 (0.85 – 0.93)
		p-value	3.10×10⁻⁵	7.94×10⁻³	1.55×10⁻²	4.08×10⁻⁸	1.01×10⁻¹	2.35×10⁻⁸

(a) Cytogenetic regions according to NCBI Human Genome Build 37 and NCBI’s Map Viewer

(b) SNP position according to NCBI Human Genome Build 37

(c) Results from the Combined Stage 1 meta-analysis of PanC4, PanScan 1, and PanScan 2

(d) Results from the Combined Stage 1 and 2 meta-analysis of PanC4, PanScan 1, PanScan 2, and PANDoRA

(e) MAF: minor allele frequency

(f) Quality of imputation metric. See Online Methods for more detail. If the SNP is genotyped and not imputed, a ‘g’ is reported

(g) Allelic odds ratio and corresponding 95% confidence interval

(h) R²>0.95
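As a reading aid for Table 1, the sketch below shows how a crude allelic odds ratio follows from the case and control allele frequencies, and how an approximate Wald p-value can be recovered from a reported OR and 95% CI. It is not the authors' analysis, which used covariate-adjusted logistic regression; the function names are illustrative, and the recovered p-value is only approximate because the table values are rounded.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile


def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Crude (unadjusted) allelic odds ratio from effect-allele frequencies."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


def wald_p_from_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Two-sided Wald p-value recovered from an OR and its 95% CI."""
    # The CI is symmetric on the log scale, so its width gives the standard error.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z95)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# PanC4 allele frequencies for rs11655237 (Table 1): cases 0.146, controls 0.110.
crude_or = allelic_odds_ratio(0.146, 0.110)

# Combined Stage 1&2 estimate for rs11655237 (Table 1): OR 1.26 (1.19 to 1.34).
p_combined = wald_p_from_ci(1.26, 1.19, 1.34)

print(f"crude allelic OR: {crude_or:.2f}")
print(f"approximate p-value: {p_combined:.1e}")
```

The crude odds ratio lands near the adjusted PanC4 estimate for this SNP, and the recovered p-value has the same order of magnitude as the reported one.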

We then conducted a genome-wide meta-analysis of the PanC4 data with data from PanScan 1[9] and PanScan 2[10] (Combined Stage 1, Fig. 1). After quality control (Online Methods and Supplementary Table 4), we analyzed 528,179 SNPs and 3,746 individuals (1,856 cases and 1,890 controls) from PanScan 1 and 557,555 SNPs and 3,300 individuals (1,618 cases and 1,682 controls) from PanScan 2. Since the genotyping platforms differed across studies, missing genotypes were imputed using IMPUTE v2[14], with 1000 Genomes[15] (release Dec 2013) and HapMap3[16] (release #2, 2009) as reference panels. For PanScan 1 and PanScan 2, we conducted association analysis using unconditional logistic regression including age and the first four principal component eigenvectors as covariates. Data from PanC4, PanScan 1, and PanScan 2 were combined (7,638 cases, 7,364 controls and 866,891 SNPs) and analyzed using a fixed-effects model implemented in METAL[17] (Fig. 3). A QQ plot (Supplementary Fig. 2) indicated appropriate control of type-1 errors, with λ values of 1.025 for PanC4, 0.998 for PanScan 1, and 1.017 for PanScan 2.
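The fixed-effects combination described above can be illustrated with a generic inverse-variance weighting sketch. This is not METAL itself, and it omits the genomic-control adjustment, so the pooled value only approximates the published one. The per-study inputs are the rounded rs11655237 estimates from Table 1; the function names are illustrative.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile


def se_from_ci(ci_low: float, ci_high: float) -> float:
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z95)


def fixed_effects_meta(estimates):
    """Pool (OR, ci_low, ci_high) tuples by inverse-variance weighting.

    Returns the pooled OR with its 95% confidence limits.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for odds_ratio, ci_low, ci_high in estimates:
        weight = 1.0 / se_from_ci(ci_low, ci_high) ** 2
        total_weight += weight
        weighted_sum += weight * math.log(odds_ratio)
    beta = weighted_sum / total_weight
    se = math.sqrt(1.0 / total_weight)
    return (math.exp(beta),
            math.exp(beta - Z95 * se),
            math.exp(beta + Z95 * se))


# rs11655237 in PanC4, PanScan 1 and PanScan 2 (Table 1, rounded values).
stage1_studies = [
    (1.38, 1.26, 1.52),
    (1.09, 0.96, 1.25),
    (1.34, 1.16, 1.55),
]
pooled_or, pooled_low, pooled_high = fixed_effects_meta(stage1_studies)
print(f"pooled OR: {pooled_or:.2f} ({pooled_low:.2f} to {pooled_high:.2f})")
```

With these rounded inputs the pooled odds ratio comes out close to the Combined Stage 1 value reported for this SNP; small differences are expected from rounding and from the omitted genomic-control step.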

Figure 3

Manhattan plot of Combined Stage 1 association analysis. Loci previously associated with pancreatic cancer in Caucasians are shown in black, 2p13.3 in blue and novel loci in red.

The Combined Stage 1 analysis (Table 1, column ‘Combined Stage 1’) yielded a second novel region of association at 3q29 (TP63, rs9854771, OR=0.87, 95%CI:0.83–0.92, P=4.08×10⁻⁸). A second SNP on 17q25.1 (rs11655237, OR=1.27, 95%CI:1.19–1.36, P=6.74×10⁻¹²), which is in high LD (r²=0.95) with rs7214041, also gave significant evidence of association in these combined data.

We next conducted a Stage 2 analysis in an independent set of 2,497 cases and 4,611 controls from the PANDoRA consortium[18]. We selected 25 SNPs from 23 independent regions (Supplementary Table 5) with p-values below 10⁻⁵ in either the PanC4 or the Combined Stage 1 analysis. When multiple SNPs per region were associated, the most significant SNP was selected; the exceptions were 17q25.1 and 2p13.3, where two SNPs each were carried forward. After quality control (Online Methods, Supplementary Table 6), 2,287 cases and 4,205 controls from the PANDoRA study were analyzed. Age-adjusted association analyses by country were carried out, and results were combined using a fixed-effect model. We observed independent evidence of association at 17q25.1 in the PANDoRA study (Table 1, column ‘PANDoRA’: rs7214041, OR=1.25, 95%CI:1.11–1.41, P=3.37×10⁻⁴). Combined analysis of the Stage 1 and 2 data for the 25 SNPs (Table 1 and Supplementary Table 5, column ‘Combined Stage 1&2’) revealed two additional significantly associated loci: 2p13.3 (ETAA1, rs1486134, OR=1.14, 95%CI:1.09–1.19, P=3.36×10⁻⁹) and 7p13 (SUGCT, rs17688601, OR=0.88, 95%CI:0.84–0.92, P=1.41×10⁻⁸). Promising signals (Supplementary Table 7) arose at 18q21.2 (GRP, rs1517037, OR=0.87, 95%CI:0.83–0.92, P=3.17×10⁻⁷), 12q24.31 (HNF1A, rs7310409, OR=1.11, 95%CI:1.06–1.15, P=6.34×10⁻⁷), 1p13.1 (WNT2B, rs351365, OR=0.89, 95%CI:0.85–0.93, P=7.39×10⁻⁷), and 20q13.11 (rs6073450, OR=1.11, 95%CI:1.06–1.15, P=9.21×10⁻⁷).

We identified and replicated a novel region of association on 17q25.1 (Fig. 4a). Two highly correlated variants (rs11655237 and rs7214041, r²=0.95) were associated with pancreatic cancer risk. Variant rs7214041 maps to LINC00673 (long inter-genic non-protein coding RNA 673). rs11655237, a non-coding transcript variant, shows significant DNase hypersensitivity in multiple cancer cell lines and binds transcription factors including P300, FOXA1, FOXA2, and the DNA repair protein RAD21 according to HaploReg v2[19]. HaploReg v2 also indicated that rs7214041 alters regulatory motifs for HNF1[19]. Interestingly, we also found suggestive evidence of an association with rs7310409, located at the HNF1A locus (12q24.31, Supplementary Fig. 3a and Supplementary Table 7). A recent study of the pancreatic cancer transcriptome suggests HNF1A may act as a tumor suppressor in pancreatic cancers[20]. Variation in HNF1A has been associated with risk of Type 2 diabetes[21,22], a well-established risk factor for pancreatic cancer[23-25], and with maturity onset diabetes of the young (MODY)[26]. Furthermore, variants in HNF1A (in particular rs7310409) and HNF4A were identified as risk factors for pancreatic cancer in pathway-based and candidate-SNP-based analyses of the PanScan data[27,28].
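For readers less familiar with the LD statistic quoted above, the following sketch computes r² between two biallelic SNPs from haplotype and allele frequencies. The frequencies used are hypothetical and chosen only for illustration; they are not taken from the study or from 1000 Genomes.

```python
def ld_r_squared(hap_freq_ab: float, freq_a: float, freq_b: float) -> float:
    """r^2 between two biallelic SNPs.

    hap_freq_ab: frequency of the haplotype carrying allele A at the first
                 SNP and allele B at the second SNP.
    freq_a, freq_b: frequencies of allele A and allele B.
    """
    # D measures how far the haplotype frequency departs from independence.
    d = hap_freq_ab - freq_a * freq_b
    denominator = freq_a * (1.0 - freq_a) * freq_b * (1.0 - freq_b)
    if denominator == 0.0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return d * d / denominator


# Hypothetical inputs: two minor alleles that almost always travel together.
example_r2 = ld_r_squared(hap_freq_ab=0.115, freq_a=0.12, freq_b=0.12)
print(f"r^2 = {example_r2:.2f}")
```

Values near 1 mean that one SNP is a near-perfect proxy for the other, which is why two variants in high LD cannot be separated by association evidence alone.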

Figure 4

Regional association and linkage disequilibrium (LD) plots for four novel genome-wide significant loci: (a) 17q25.1, (b) 3q29, (c) 2p13.3, and (d) 7p13. Association p-values are shown for three analyses: PanC4 only (black circles), Combined Stage 1 (PanC4, PanScan 1, and PanScan 2) (grey circles), and Combined Stage 1 and 2 (PanC4, PanScan 1, PanScan 2, and PANDoRA) (red circles). LD plots are based on 1000 Genomes European samples.

We also identified significant association for two variants in high LD (rs9854771 and rs1515496, r²=0.99) located in an intron of TP63 on 3q29 (Fig. 4b). p63 is a p53 homologue implicated in tumorigenesis and metastasis[29] through its role in cell-cycle arrest and apoptosis. Overexpression of p63 can mimic p53 activation in certain experimental models[30]. Interestingly, different isoforms of p63 have opposing effects; TAp63 has tumor suppressive effects while DNp63 has oncogenic effects[31]. Danilov and colleagues suggested that DNp63α was the predominant isoform in pancreatic cancer cell lines and promoted pancreatic cancer growth, motility and invasion[32]. Previous GWAS of lung cancer and bladder cancer have demonstrated significant evidence of association for SNPs in TP63[33-37]. A HaploReg query of this region showed that both variants are predicted to be conserved elements via GERP, suggesting functional roles.

Our analysis revealed genome-wide significance in a region on 2p13.3 (rs1486134). A pancreatic cancer GWAS in Han Chinese subjects[12] found suggestive evidence for another SNP on 2p13.3 (rs2035565) (Supplementary Table 3). High LD is present throughout this region (Fig. 4c), including strong LD between rs1486134 and rs2035565 in European and Asian populations based on 1000 Genomes[15] samples (r²=0.91 and r²=0.90, respectively). This region includes the gene ETAA1 (Ewing tumor-associated antigen 1), alias ETAA16, which may function as a tumor-specific cell surface antigen in the Ewing’s family of tumors[38].

We observed significant association on 7p13 for rs17688601, located in an intron of the SUGCT (succinyl-CoA:glutarate-CoA transferase) gene (alias c7orf10) (Fig. 4d). This variant is predicted in HaploReg v2 to alter binding of HNF1-4 and other DNA binding proteins[19]. The SUGCT protein is involved in glutarate metabolism, and mutations in this gene are associated with glutaric aciduria[39]. While there is evidence of altered tricarboxylic acid cycle metabolism in pancreatic cancer[40], the role of this gene in pancreatic cancer risk is unclear.

The Combined Stage 1 and Stage 2 analysis identified suggestive evidence of association (P<1×10⁻⁶) in four regions: 12q24.31 (HNF1A) (Supplementary Fig. 3a), 18q21.2 (GRP) (Supplementary Fig. 3b), 1p13.1 (5′ of WNT2B) (Supplementary Fig. 3c), and 20q13.11 (Supplementary Fig. 3d). GRP (gastrin releasing peptide) production has been associated with pancreatic tumor growth in vitro[41]. WNT signaling plays an important role in pancreas development. WNT2B (Wingless-Type MMTV Integration Site Family, Member 2B) is overexpressed in pancreatic cancer and has been associated with decreased survival[42]. The 20q13.11 variant is located ~20 kb from the HNF4A (MODY) gene, mutations of which are associated with early-onset diabetes[43].

In the PanC4 study we observed 11 SNPs on chromosome 9q31.3 (Supplementary Fig. 3e) in moderate to high LD (r² values between 0.6 and 1) with p-values from 7.00×10⁻⁸ to 2.73×10⁻⁶, including rs10991043 (OR=1.19, 95%CI:1.12–1.26, P=7.00×10⁻⁸) near the SMC2 (structural maintenance of chromosome 2) gene. This gene plays an important role in DNA repair in humans. While there was no evidence of association in the other study populations examined, the strong signal across multiple SNPs in PanC4 suggests that this region merits further investigation.

Further functional characterization of these associated regions is needed, including examining whether these SNPs are functional through eQTL analysis. Performing eQTL analysis of pancreatic tissues is challenging. Normal pancreatic tissue is composed primarily of acinar cells (>90%), but pancreatic ductal adenocarcinoma has a ductal phenotype, and the appropriate normal tissue to analyze is unclear because the cell of origin of pancreatic ductal adenocarcinomas is debated. eQTL analysis of pancreatic tumor tissue is also problematic because the tumor tissue of a pancreatic ductal adenocarcinoma contains a variable mixture of cell types including fibroblasts, multiple types of immune cells, non-neoplastic pancreatic cells and cancer cells, with cancer cells representing only a minority of the total cell population. Furthermore, gene expression analysis of normal pancreatic tissue is often limited by the RNA degradation associated with high-level RNase expression in pancreatic acinar cells. An ideal eQTL study of pancreatic cells would take these challenges into account.

Smoking is a well-established risk factor for pancreatic cancer[44-47]. For all nine SNPs identified in Table 1 and Supplementary Table 7, we conducted an analysis stratified by smoking status (ever smoker vs. never smoker) in PanC4 samples. No qualitative differences in effect size between ever smokers and never smokers were observed (results not shown). Furthermore, when we included an interaction term in the model, this term was not significant at the 0.05 level.

We estimated the heritability of pancreatic cancer due to common GWAS SNPs using data from PanC4 samples of Caucasian ancestry, restricted to directly genotyped SNPs (3,828 cases, 3,551 controls and 620,357 SNPs), as well as the combined dataset (7,032 cases, 6,866 controls and 268,681 SNPs). Using a disease prevalence of 0.0149, reflecting the lifetime risk of pancreatic cancer, we estimated that genome-wide common SNPs explained 16.4% (95%CI: 10.4%–22.4%) of the total phenotypic variation in PanC4 and 13.1% (95%CI: 9.9%–16.3%) in the combined dataset. The established associated regions (loci in Table 1 and Supplementary Table 3) accounted for 3.0% (95%CI: 2.0%–3.9%) and 2.1% (95%CI: 1.7%–3.1%) of the total phenotypic variation in the PanC4 population and the combined dataset, respectively.
We identified several novel regions involved in pancreatic cancer susceptibility, and provided additional evidence to support many of the established associations. While it is of interest that many of these highly associated variants are located in the introns of genes, these associations could be due to more distant genomic effects. Follow-up studies, including functional studies, are needed to fully understand how these variants (either directly or indirectly) impact risk of pancreatic cancer. Our work highlights the importance of common variation in pancreatic cancer risk.

Online Methods

Stage 1 Methods

PanC4 Quality Control

In total, 8,052 individuals were selected for genotyping from studies participating in the Pancreatic Cancer Case-Control Consortium (PanC4). Participating sites included: the Central Europe study coordinated by the International Agency for Research on Cancer (IARC/Central Europe)[48], Johns Hopkins Hospital[49,50], Mayo Clinic[51], MD Anderson Cancer Center[52], Memorial Sloan Kettering Cancer Center[53], University of Toronto[54], Queensland[55], University of California San Francisco (UCSF)[56], and Yale University[57] (Supplementary Table 1). Cases were defined as individuals with adenocarcinoma of the pancreas. DNA samples from these PanC4 individuals, together with 180 study duplicates, 176 HapMap control samples, and 26 replicates from the previous pancreatic cancer GWAS PanScan 2[10], were genotyped on the Illumina HumanOmniExpressExome-8v1 array at the Johns Hopkins Center for Inherited Disease Research (CIDR). Genotypes were called using GenomeStudio version 2011.1, Genotyping Module 1.9.4 and GenTrain version 1.0. Genotyping results were inspected for quality by assessing the missing call rate, allelic imbalance, heterozygosity, discordance in reported versus genotyped gender, relatedness, ancestry and chromosomal anomalies. Unexpected relatedness between pairs of samples was assessed using the method of moments[58] implemented in SNPRelate[59]. The median genotype call rate was 99.9%, with all individuals having a call rate greater than 98%. After removing individuals with excessive allele sharing, duplicates and subjects with incomplete information on age, 7,956 subjects (4,164 cases and 3,792 controls) were available for statistical analyses (Supplementary Table 2).
SNPs with the following characteristics were excluded from statistical analyses: positional duplicates, more than two discordant calls in study duplicates, technical failures or missing call rate greater than 2%, more than one Mendelian error in HapMap control trios, Hardy-Weinberg equilibrium p-value <10⁻⁶, sex difference in allele frequency greater than 0.2 for autosomes/XY in samples of European ancestry, and minor allele frequency (MAF) less than 0.005. Overall, 654,470 SNPs passed the quality control filters applied; the median missing call rate was 0.024% and 98% of SNPs had a missing call rate less than 1% (Supplementary Table 2).
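Three of the SNP-level exclusion criteria above (missing call rate, MAF and Hardy-Weinberg equilibrium) can be sketched from genotype counts alone. This is a simplified illustration rather than the pipeline used in the study: the HWE test here is a one-degree-of-freedom chi-square test, which is an assumption because the text does not say which HWE test was applied, and the function names are illustrative. The default thresholds are the ones quoted in the paragraph.

```python
import math


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For one degree of freedom, the chi-square tail equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snp_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int,
                  max_missing: float = 0.02,
                  min_maf: float = 0.005,
                  min_hwe_p: float = 1e-6) -> bool:
    """Apply missing-call-rate, MAF and HWE filters to one SNP."""
    called = n_aa + n_ab + n_bb
    total = called + n_missing
    if called == 0:
        return False
    if n_missing / total > max_missing:
        return False
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    if min(freq_a, 1.0 - freq_a) < min_maf:
        return False
    return hwe_chi2_p(n_aa, n_ab, n_bb) >= min_hwe_p


# Hypothetical genotype counts for illustration only.
print(passes_snp_qc(810, 180, 10, n_missing=0))    # in HWE, well genotyped
print(passes_snp_qc(500, 0, 500, n_missing=0))     # no heterozygotes at all
print(passes_snp_qc(810, 180, 10, n_missing=50))   # too many missing calls
```

In practice the HWE filter is usually evaluated in controls or in a single ancestry group, since population structure and true association can both distort genotype proportions.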

PanScan 1 and PanScan 2 Quality Control

PanScan 1 and PanScan 2 data were obtained from dbGaP[60,61] (dbGaP study accession: phs000206.v4.p3). Data from all participating sites apart from Group Health (which required a separate data sharing agreement) were included in the analysis. The previously published PanScan 1[9] and PanScan 2[10] studies used the Illumina HumanHap550 and Illumina Human 610-Quad chips, respectively. Quality control was performed as described above for PanC4. Forty-five unexpected duplicates between PanScan 1, PanScan 2, and PanC4 were identified and removed from analyses of the PanScan datasets. After data cleaning, 528,179 SNPs and 3,746 individuals (1,856 cases and 1,890 controls) remained in PanScan 1, and 557,555 SNPs and 3,300 individuals (1,618 cases and 1,682 controls) remained in PanScan 2 (see Supplementary Table 4).

Association Analysis

To investigate population structure, principal components analysis (PCA) was conducted separately for PanC4, PanScan 1 and PanScan 2 using SNPRelate[59] (Supplementary Fig. 4). Genotype imputation was performed separately for PanScan 1, PanScan 2 and PanC4 using IMPUTE v2[14]. Since PanScan 1 and PanScan 2 SNPs were originally mapped using an older genome assembly (NCBI build 36), we converted their genome positions to genome assembly NCBI build 37 using LIFTOVER. Markers not identified in the build 37 assembly were removed. To decrease computational time, we pre-phased genotypes to produce best-guess haplotypes using SHAPEIT v2 software[62]. Both 1000 Genomes[15] Phase I-integrated haplotypes (release Dec 2013) and HapMap3[16] (release #2, 2009) were used as reference panels during imputation. After imputation, SNPs with quality scores < 0.3 were excluded from all subsequent analysis. Only SNPs directly genotyped in either PanC4, PanScan 1, or PanScan 2 and passing quality control filters were retained for analysis. This resulted in 866,891 SNPs in the Combined Stage 1 analysis. The expected genotype counts were then analyzed using the frequentist test option of SNPTEST[63]. Decade of age and eigenvectors from PCA were included as covariates. The number of eigenvectors to include was chosen based on inspection of the scree plot and p-values from association between eigenvectors and pancreatic cancer status. The results from each study were then combined using a fixed-effects inverse standard error approach implemented by METAL[17] (see Supplementary Table 5, column ‘Combined Stage 1’). Test statistic inflation (λ) was estimated to be 1.025 for PanC4, 0.998 for PanScan 1, and 1.017 for PanScan 2. Test statistics for PanC4 and PanScan 2 were adjusted to account for small amounts of population stratification using METAL’s genomic control option.
Our sample size gives us over 80% power to detect an odds ratio of 1.2 for SNPs with a minor allele frequency greater than 0.20. Manhattan and QQ plots of PanC4 GWAS are shown in Fig. 2 and Supplementary Fig. 1, respectively. Manhattan and QQ plots showing association results from the Combined Stage 1 analysis are shown in Fig. 3 and Supplementary Fig. 2. To examine whether our association results were confounded by population stratification, we conducted a secondary analysis, restricting our samples to those of European ancestry based on a PCA analysis performed with PanC4 and Hapmap3 samples. The loci identified through association testing did not change, and their odds ratios and p-values did not vary significantly (results not shown).
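The inflation factor λ reported above is conventionally the median observed one-degree-of-freedom chi-square statistic divided by its expected median under the null. The sketch below computes λ from z-scores and shows a genomic-control style adjustment of a standard error. It assumes that no adjustment is made when λ is at or below 1, which is consistent with only PanC4 and PanScan 2 being adjusted but is not spelled out in the text. The simulated z-scores stand in for real test statistics.

```python
import math
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(z_scores):
    """Lambda: median observed chi-square divided by its null expectation."""
    chi2_values = [z * z for z in z_scores]
    return statistics.median(chi2_values) / CHI2_1DF_MEDIAN


def gc_adjusted_se(standard_error: float, lam: float) -> float:
    """Widen a standard error by sqrt(lambda); unchanged if lambda <= 1."""
    return standard_error * math.sqrt(lam) if lam > 1.0 else standard_error


# Simulated null z-scores (illustration only, not study data).
rng = random.Random(0)
null_z = [rng.gauss(0.0, 1.0) for _ in range(20000)]
lam_null = genomic_inflation(null_z)

print(f"lambda for simulated null statistics: {lam_null:.3f}")
print(f"SE 0.0500 adjusted at lambda 1.025: {gc_adjusted_se(0.05, 1.025):.4f}")
```

A λ close to 1, as reported for all three Stage 1 datasets, indicates little residual inflation from population stratification or cryptic relatedness.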

Stage 2 Methods

PANDoRA Replication Study

Twenty-five SNPs from 23 independent regions identified as showing evidence of association (P<1×10⁻⁵) in either the PanC4 analysis or the Combined Stage 1 analysis were genotyped in samples from the PANcreatic Disease ReseArch (PANDoRA)[18] consortium with TaqMan technology. These samples were drawn from case-control studies in six European countries: Czech Republic, Germany, Greece, Italy, Lithuania, and Poland. In total, 2,497 cases with pancreatic cancer and 4,611 controls were genotyped. Eight percent of samples were duplicated and overall concordance was >99%. Supplementary Table 6 shows the features of the PANDoRA dataset. Samples missing more than 2 SNPs (~15%) or missing covariate information were excluded from analyses. In total, 2,287 cases and 4,205 controls from the PANDoRA study remained after quality control. Because PANDoRA is a collection of samples from various centers, we analyzed each country separately. Logistic regression models with additive effects of each allele were fit, as implemented in PLINK[64] (Supplementary Table 5, column ‘PANDoRA’). Two SNPs, rs16867971 for Greece and rs10850078 for Lithuania, showed evidence of departure from HWE in controls (P<0.001). Each of these SNPs was excluded from the analysis for the country in which it violated HWE. A final fixed-effects meta-analysis of PanC4, PanScan 1, PanScan 2, and PANDoRA (Combined Stage 1 and 2 analysis) was then conducted using METAL[17] on the 25 SNPs chosen for inclusion in Stage 2. Results are in Supplementary Table 5, column ‘Combined Stage 1&2’. To further interrogate these regions we examined all 1000G imputed SNPs (as described above) in regions with significant or suggestive (P<1×10⁻⁶) evidence of association (Fig. 4 and Supplementary Fig. 3). In our Combined Stage 1 analysis (Supplementary Table 5, column ‘Combined Stage 1’), two SNPs selected for replication were observed to have heterogeneity p-values below 0.001.
When restricting our analysis to individuals of European ancestry, heterogeneity p-values from the meta-analysis remained virtually unchanged, implying that the heterogeneity was not due to population stratification. One SNP, rs16867971, was directly genotyped in PanC4 and imputed in PanScan 1 and PanScan 2. We found evidence of association for rs16867971 in PanC4, but not in PanScan 1 or PanScan 2 (P>0.05). The second SNP, rs6706539, was also directly genotyped in PanC4 and imputed in PanScan 1 and PanScan 2. For rs6706539, allele A was associated with increased risk in PanScan 1 and PanScan 2 (P=0.008 and P=0.04, respectively) but was protective in PanC4 (P=3.4×10⁻⁶). Upon examination of the imputed and non-imputed SNPs adjacent to rs6706539, we found that r² values between this SNP and other SNPs within 10 kb were low, ranging from 0.05 to 0.18 in 1000 Genomes CEU samples. It is possible that this low LD made imputation of this SNP difficult. Additionally, inspection of the alleles coded as reference and alternate alleles for this SNP in PanC4 and 1000 Genomes suggests that this discrepancy is not due to differences in strand alignment. Forest plots of our top hits (Supplementary Fig. 5a–5i), showing the magnitude of odds ratios for each study population, show that for the majority of the “top” SNPs (those with p-value <1×10⁻⁶) the direction of the effect was consistent across populations. Additionally, none of the top SNPs showed significant heterogeneity (at the 0.05 level) in the Combined Stage 1 and Stage 2 analysis. However, in many instances the effect size in the PanScan 1 population was smaller than that observed in PanScan 2, PanC4 and PANDoRA. We also conducted a random-effects meta-analysis, and overall the results were consistent with those from the fixed-effect meta-analysis (results not shown).

Heritability Analysis

Heritability analysis was performed using GCTA[65] software. This analysis estimates the percentage of phenotypic variance explained by common SNPs. We assumed a prevalence of 0.0149 (risk to age 90 in the US Caucasian population; SEER data collected in 2009–2011). We excluded individuals not clustering with HapMap[16] CEU (CEPH, Utah residents with ancestry from northern and western Europe) samples in PCA, as well as individuals with estimated relationships >0.05 or missing genotype rate >0.01. SNPs with missing rate >0.05, MAF <0.01 or HWE p-value <5×10⁻⁴ were also excluded. We estimated the overall heritability in the PanC4 study using SNP data, as well as the heritability attributed to the 12 regions with significant evidence of association in the Caucasian population plus the 6 suggestive regions identified.
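The role of the assumed prevalence can be made concrete with the liability-scale transformation commonly used for case-control SNP heritability. The text does not give the formula, so the sketch below is an assumption about the kind of conversion involved rather than a description of the authors' exact computation. The prevalence (0.0149) and the PanC4 case and control counts come from the text; the observed-scale heritability value is hypothetical.

```python
from statistics import NormalDist


def liability_scale_factor(prevalence: float, case_fraction: float) -> float:
    """Multiplier converting observed-scale h2 to the liability scale.

    prevalence:    population prevalence K of the disease.
    case_fraction: proportion P of cases in the analyzed sample.
    """
    standard_normal = NormalDist()
    threshold = standard_normal.inv_cdf(1.0 - prevalence)
    z = standard_normal.pdf(threshold)  # normal density at the threshold
    k, p = prevalence, case_fraction
    return (k * (1.0 - k)) ** 2 / (z * z * p * (1.0 - p))


# Inputs stated in the text: prevalence 0.0149; 3,828 cases and 3,551 controls.
K = 0.0149
P = 3828 / (3828 + 3551)
factor = liability_scale_factor(K, P)

# Hypothetical observed-scale estimate, for illustration only.
h2_observed_example = 0.25
h2_liability_example = h2_observed_example * factor

print(f"conversion factor: {factor:.3f}")
print(f"liability-scale h2 for the hypothetical input: {h2_liability_example:.3f}")
```

Because cases are heavily oversampled relative to a prevalence of about 1.5%, the choice of prevalence materially changes the liability-scale estimate, which is why it is reported alongside the heritability values.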

HaploReg

HaploReg is a tool used for exploring functional annotations of non-coding variants. For each variant and region identified in this study, we used HaploReg to gain insight into functional annotations including chromatin state (promoters and enhancers), conserved regions, variant effect on regulatory motifs and protein binding sites. Regions were defined by SNPs with r2>0.8 to the associated SNP.

