Literature DB >> 25990418

A new GWAS and meta-analysis with 1000Genomes imputation identifies novel risk variants for colorectal cancer.

Nada A Al-Tassan¹, Nicola Whiffin², Fay J Hosking², Claire Palles³, Susan M Farrington⁴, Sara E Dobbins², Rebecca Harris⁵, Maggie Gorman³, Albert Tenesa⁶, Brian F Meyer¹, Salma M Wakil¹, Ben Kinnersley², Harry Campbell⁷, Lynn Martin³, Christopher G Smith⁵, Shelley Idziaszczyk⁵, Ella Barclay³, Timothy S Maughan⁸, Richard Kaplan⁹, Rachel Kerr¹⁰, David Kerr¹¹, Daniel D Buchanan¹², Daniel D Buchannan, Aung Ko Win¹³, John Hopper¹³, Mark Jenkins¹³, Noralane M Lindor¹⁴, Polly A Newcomb¹⁵, Steve Gallinger¹⁶, David Conti¹⁷, Fred Schumacher¹⁷, Graham Casey¹⁷, Malcolm G Dunlop⁴, Ian P Tomlinson³, Jeremy P Cheadle⁵, Richard S Houlston².

Abstract

Genome-wide association studies (GWAS) of colorectal cancer (CRC) have identified 23 susceptibility loci thus far. Analyses of previously conducted GWAS indicate additional risk loci are yet to be discovered. To identify novel CRC susceptibility loci, we conducted a new GWAS and performed a meta-analysis with five published GWAS (totalling 7,577 cases and 9,979 controls of European ancestry), imputing genotypes utilising the 1000 Genomes Project. The combined analysis identified new, significant associations with CRC at 1p36.2 marked by rs72647484 (minor allele frequency [MAF] = 0.09) near CDC42 and WNT4 (P = 1.21 × 10(-8), odds ratio [OR] = 1.21 ) and at 16q24.1 marked by rs16941835 (MAF = 0.21, P = 5.06 × 10(-8); OR = 1.15) within the long non-coding RNA (lncRNA) RP11-58A18.1 and ~500 kb from the nearest coding gene FOXL1. Additionally we identified a promising association at 10p13 with rs10904849 intronic to CUBN (MAF = 0.32, P = 7.01 × 10(-8); OR = 1.14). These findings provide further insights into the genetic and biological basis of inherited genetic susceptibility to CRC. Additionally, our analysis further demonstrates that imputation can be used to exploit GWAS data to identify novel disease-causing variants.

Entities: CellLine Chemical Disease Gene Mutation Species

Mesh：

Substances：

Year: 2015 PMID： 25990418 PMCID： PMC4438486 DOI： 10.1038/srep10442

Source DB: PubMed Journal: Sci Rep ISSN： 2045-2322 Impact factor: 4.379

Twin studies indicate that heritable factors account for 35% of the variation in risk of developing colorectal cancer (CRC)1. However, only 5% of CRC can be attributed to the inheritance of high-penetrance mutations in the known genes23. Genome-wide association studies (GWAS) conducted primarily in European456789101112 but also Asian13141516 populations have vindicated the long-held belief that part of the heritable risk of CRC is attributable to common, low-risk variants. These GWAS have provided insights into the biological basis of CRC, highlighting the role of genes within the bone morphogenetic protein signalling pathway (BMP2, BMP4, GREM1 and SMAD7)411 and some candidate genes (e.g. CDH1/CDH3), as well as genes not previously implicated in CRC (e.g. POLD3, TERC, CDKN1A and SHROOM2)56. Despite the success of GWAS the risk SNPs so far identified in European populations account for only 8% of the familial CRC risk (Supplementary Table 1). Together with the over-representation of association signals in GWAS strongly suggests that additional risk SNPs remain to be discovered. The statistical power of individual GWAS is limited by the modest effect sizes of the genetic variants and the requirement for a stringent threshold to establish statistical significance in order to avoid type 1 errors. Meta-analysis of GWAS data therefore offers the opportunity to identify new CRC risk loci and provide further insights into tumour biology. Furthermore, imputation of untyped variants in GWAS data using publicly available reference datasets increases the number of variants that can be tested for an association with CRC risk. To identify new CRC susceptibility loci, we conducted an independent primary scan of CRC using patient samples from the COIN trial and performed a genome-wide meta-analysis with five previously published GWAS. To recover untyped genotypes, thereby maximising the prospects of identifying risk variants, we imputed over 10 million SNPs in the six GWAS datasets, using data from the 1000 Genomes Project17 as reference (see Materials & Methods for details).

Methods

Primary GWAS

The COIN GWAS was based on 2,244 CRC cases (64% male, mean age 61 years, SD = 10) ascertained through two independent Medical Research Council clinical trials of advanced/metastatic CRC; COIN and COIN-B18. Cases were genotyped using Affymetrix Axiom Arrays according to the manufacturer’s recommendations (Affymetrix, Santa Clara, CA 95051, USA), using duplicate samples and sequencing of significantly associated SNPs in a subset of samples to confirm genotyping accuracy. For all SNPs >99% concordant results were obtained. For controls, we made use of Wellcome Trust Case Control Consortium 2 (WTCCC2) Affymetrix 6.0 array data on 2,674 individuals from the UK Blood Service Control Group. Individuals were excluded with: <95% successfully genotyped SNPs (n = 122), discordant sex information (n = 8), classed as out of bounds by Affymetrix (n = 30), duplication or cryptic relatedness (identity by descent >0.185, n = 4), evidence of non-white European ancestry using PCA in conjunction with HapMap samples (n = 130; cut-off based on the minimum and maximum values of the top two principal components of the controls; Supplementary Figure 2). The details of all sample exclusions are provided in Supplementary Figure 3. We excluded SNPs from the analysis with: call rate <95%; different missing genotype rate between cases and controls at P < 10−5; MAF < 0.01; departure from Hardy–Weinberg equilibrium in controls at P < 10−5. The adequacy of the case–control matching and the possibility of differential genotyping of cases and controls were assessed using quantile-quantile (Q–Q) plots of test statistics.

Statistical and bioinformatic analysis

Analyses were undertaken using R(v3.02)24 and PLINK25 software. The association between each SNP and the risk of CRC was assessed by the Cochran–Armitage trend test. ORs and associated 95% CIs were calculated by unconditional logistic regression. Phasing of GWAS SNP genotypes was performed using SHAPEIT(v2.644)26. Prediction of the untyped SNPs was carried out using IMPUTE(v2.3.0)27 based on the data from the 1000 Genomes Project (Phase 1 integrated variant set, v3.20101123)28 as reference. Imputed data were analyzed using SNPTEST(v2.4.1)29. Association meta-analyses only included markers with info scores >0.4, imputed call rates/SNP > 0.9 and MAFs > 0.01. The fidelity of imputation, as assessed by the concordance between imputed and sequenced SNPs, was examined in a subset of 200 UK cases. Meta-analyses were carried out using META(v2.4-1)30, under an inverse-weighted fixed-effects model using the genotype probabilities from IMPUTE, where a SNP was not directly typed. We calculated Cochran’s Q statistic to test for heterogeneity and the I statistic to quantify the proportion of the total variation that was caused by heterogeneity –I values ≥75% are considered characteristic of large heterogeneity31. Associations by sex, age and clinico-pathological phenotypes were examined by logistic regression in case-only analyses. The familial relative risk of CRC attributable to each variant was calculated as detailed by Pharoah et al.32 assuming the overall familial risk of CRC, as shown in epidemiological studies, is 2.233. To explore epigenetic profiles of association signals, we used ChromHMM34. States were inferred from ENCODE Histone Modification data on the CRC cell line HCT116 (DNAse, H3K4me3, H3K4me1, H3K27ac, Pol2 and CTCF)35 binarized using a multivariate Hidden Markov Model. To examine whether any of the SNPs or their proxies (i.e. r2 > 0.8 in 1000genomes CEU reference panel) annotate putative transcription factor binding/enhancer elements we used the CADD (combined annotation dependent depletion) web-server36. We assessed sequence conservation using: PhastCons (<0.3 indicative of conservation), Genomic Evolutionary Rate Profiling37 (GERP) (−12 to 6, with 6 being indicative of complete conservation) and CADD (>10.0 deemed to be deleterious).

Analysis of TCGA data

To examine for a relationship between SNP genotype and mRNA expression we made use of Tumor Cancer Genome Atlas (TCGA)38 RNA-seq expression and Affymetrix 6.0 SNP data (dbGaP accession number: phs000178.v7.p6) on 223 colorectal adenocarcinoma (COAD) and 75 rectal adenocarcinoma samples using a best proxy where SNPs were not represented directly. Association between normalised RNA counts per-gene and SNP genotype was quantified using the Kruskal-Wallis trend test. The frequency of somatic mutations in CRC was obtained using the CBioPortal for Cancer Genomics3940 and TumorPortal web servers41.

Pathway analysis

To determine whether any genes mapping to the three newly identified regions act in pathways already over-represented in GWAS regions we utilized the NCI pathway interaction database42. All genes within the LD block containing each tagSNP, or linked to the SNP through functional experiments (MYC) were submitted as a Batch query using the NCI-Nature curated data source.

Assignment of microsatellite instability (MSI), KRAS, NRAS and BRAF status in cancers

Tumour MSI status in CRCs was determined using the mononucleotide microsatellite loci BAT25 and BAT26, which are highly sensitive MSI markers. Samples showing more than or equal to five novel alleles, when compared with normal DNA, at either or both markers were assigned as MSI-H (corresponding to MSI-high)43. Tumours from the COIN study were screened for mutations in KRAS codons 12, 13, and 61 and BRAF codon 600 by pyrosequencing18. Additionally, KRAS (all three codons), BRAF (codons 594 and 600), and NRAS (codons 12 and 61) were screened for mutations by MALDI-TOF mass array (Sequenom, San Diego, CA, USA)44.

Results

In the primary scan, 2,244 advanced (stage IV) CRC cases ascertained through the Medical Research Council (MRC) trials COIN18 and COIN-B45 were analysed with control data on 2,674 individuals from the WTCCC2 UK National Blood Service Control Group. After applying strict quality control criteria (Materials and Methods), we analysed 234,675 autosomal SNPs for association with CRC risk in 1,950 cases and 2,162 controls. A Q–Q plot of observed versus expected χ2-test statistics showed little evidence for an inflation of test statistics, thereby excluding the possibility of substantive hidden population substructure, cryptic relatedness among subjects or differential genotype calling (inflation factor λ = 1.05; Supplementary Figure 1). We performed a meta-analysis of our primary scan data with five non-overlapping GWAS case-control series of Northern European ancestry, which have been previously reported (Supplementary Table 2). The adequacy of the case-control matching and possibility of differential genotyping of cases and controls was assessed using Q-Q plots of test statistics. λGC values46 for the UK1, Scotland1, VQ58, CCFR1 and CCFR2 studies were 1.02, 1.01, 1.01, 1.02 and 1.03 respectively (Supplementary Figure 1). Any ethnic outliers or individuals identified as related were excluded (Supplementary Figure 2). After quality control procedures, the six GWAS provided data on 7,577 CRC cases and 9,979 controls. To maximise the prospects of identifying novel risk variants, we imputed over 10 million variants using 1000 Genomes Project Pilot data as a reference panel. Q-Q plots for all variants post-imputation did not show evidence of substantive over-dispersion introduced by imputation (Supplementary Figure 1).

Meta-analysis

Associations for all 23 established European CRC risk SNPs showed a direction of effect consistent with previously reported studies, with eight of the loci having a P-value of <5.0 × 10−8 (Supplementary Table 3; Fig. 1). Additionally six SNPs previously identified in GWAS in Asian populations as determinants of CRC risk showed evidence for an association in this meta-analysis; albeit at varying degrees of significance (P-values ranging from 3.64 × 10−2 to 1.71 × 10−3; Supplementary Table 3); thereby providing support for trans-ethnic effects.

Figure 1

Genome-wide P-values (–log10P, y-axis) plotted against their respective chromosomal positions (x-axis).

Known regions attaining genome-wide significance (i.e. P = 5.0 × 10−8) are labelled with their chromosomal location. Variants in grey lie in novel regions that reach the significance threshold level (P = 1.0 × 10−7) required for variants to be analysed further in this study. Variants in black lie in novel regions attaining genome-wide significance.

Excluding SNPs (including those correlated with r2 > 0.8) mapping to the risk loci, five variants in distinct regions of linkage disequilibrium (LD) were associated with CRC at P < 1.0 × 10−7 (Table 1; Fig. 1).

Table 1

Summary statistics for variants showing an association with CRC risk at < 1.0 × 10 .

							Individual study P-values						Meta-analysis
Locus	Nearest gene(s)	SNP	Position (bps)	Alleles	RAF	INFO	UK1	Scotland1	VQ58	CFR1	CFR2	COIN	OR (95% CI)	P	P_het
1p36.12	WNT4/CDC42	rs72647484	22,587,728	TC	0.91	0.94 (0.85-0.99)	3.25 ×10⁻²	3.32 ×10⁻⁵	4.99 ×10⁻²	4.08 ×10⁻²	4.58 ×10⁻²	3.47 ×10⁻²	1.24 (1.15-1.33)	1.21 ×10^-8	0.33
5q15	ERAP1	rs202110856	96,129,872	GGC	0.99	0.79 (0.66-0.92)	2.97 ×10⁻¹	5.96 ×10^-8	2.81 ×10⁻²	4.43 ×10⁻¹	3.35 ×10⁻¹	3.67 ×10⁻¹	1.51 (1.23-1.86)	6.67 ×10^-8	0.13
10p13	CUBN	rs10904849	16,997,266	GT	0.68	0.98 (0.97-1.00)	2.90 ×10⁻²	3.39 ×10⁻¹	2.36 ×10⁻²	8.68 ×10⁻³	7.73 ×10⁻²	1.29 ×10⁻³	1.13 (1.08-1.19)	7.01×10^-8	0.83
16p13.2	C16orf72	rs79900961	9,297,812	GA	0.98	0.70 (0.61-0.74)	2.21 ×10⁻¹	8.68 ×10⁻²	1.04 ×10⁻³	2.54 ×10⁻²	2.41 ×10⁻¹	1.02 ×10⁻³	1.49 (1.26-1.76)	4.93 ×10^-8	0.76
16q24.1	FOXL1	rs16941835	86,695,720	GC	0.21	0.97 (0.92-0.99)	1.04 ×10⁻¹	1.17 ×10⁻¹	1.57 ×10⁻⁴	3.74 ×10⁻³	1.25 ×10⁻²	3.65 ×10⁻¹	1.16 (1.09-1.22)	5.06 ×10^-8	0.40

For each variant shown along with meta-analysis test statistics are the P-values from the six individual studies and imputation Information scores. Risk alleles are given in bold.

INFO, imputation Information score; P-het, P-value of heterogeneity between studies; RAF, risk allele frequency.

We assessed the fidelity of imputation in 200 UK cases by comparing imputed genotypes with those obtained by sequencing. For the three common variants (MAF > 0.05), rs72647484, rs16941835 and rs10904849 which each had imputation info scores >0.9 there was high correlation between imputed and directly typed genotype (r2 = 0.98, 1.00 and 0.99, respectively). For the rare variant rs79900961 (MAF = 0.016), the correlation was poor (r2 = 0.60). The call rate for the rare Indel on chromosome 5q15 (rs202110856) in the sequencing data was only 71% and both imputed heterozygotes were sequenced as homozygous reference. Therefore, only the three common variants at 1p36.12, 10p13 and 16q24.1 were subject to further analyses. In the combined analysis of the six GWAS datasets, rs72647484, which maps to chromosome 1p36.12 (22,587,728 bps; NCBI build 37), showed the strongest evidence for association with CRC (P = 1.21 × 10−8; Phet = 0.33, I2 = 14%; Fig. 2a). rs72647484 maps within a 300 kb block of LD encompassing WNT4 (wingless-type mmtv integration site family, member 4; MIM 603490) and CDC42 (cell division cycle 42, MIM 116952; Fig. 3a). The second strongest association was provided by rs16941835 (P = 5.06 × 10−8; Phet = 0.40, I2 = 3%; Fig. 2c) which localises to the long non-coding RNA (lncRNA) RP11-58A18.1 at chromosome 16q24.1 (86,659,720 bps; NCBI build 37) within a 65 kb region of LD (Fig. 3c). The nearest coding gene, ~500 kb away, is the transcription factor FOXL1. The third strongest association was provided by rs10904849 (P = 7.01 × 10−8; Phet = 0.83, I2 = 0%; Fig. 2b) which localises to chromosome 10p13 (16,997,266 bps; NCBI build 37) within intron 31 of the gene encoding cubulin (CUBN; alias intrinsic factor-cobalamin receptor [IFCR], MIM 602997; Fig. 3b).

Figure 2

Forest plot of the odds ratios for the association between rs72647484, rs16941835, rs10904849 and CRC.

Studies were weighted according to the inverse of the variance of the log of the OR calculated by unconditional logistic regression. Horizontal lines: 95% confidence intervals (95% CI). Box: OR point estimate; its area is proportional to the weight of the study. Diamond (and broken line): overall summary estimate, with confidence interval given by its width. Unbroken vertical line: null value (OR = 1.0).

Figure 3

Regional plot of association results and recombination rates for the (a) 1p36.12, (b) 10p13 and (c) 16q24.1 risk loci. Association results of both genotyped (triangles) and imputed (circles) SNPs in the GWAS samples and recombination rates within the loci at 1p36.12 (a), 10p13 (b) and 16q24 (c). For each plot, −log10 P values (y axis) of the SNPs are shown according to their chromosomal positions (x axis). The top imputed SNP in each combined analysis is shown as a large triangle and is labelled by its rsID. The colour intensity of each symbol reflects the extent of LD with the top SNP: white (r2 = 0) through to dark red (r2 = 1.0), with r2 estimated from the 1000 Genomes Phase 1 data. Genetic recombination rates (cM/Mb), are shown with a light blue line. Physical positions are based on NCBI build 37 of the human genome. Also shown are the relative positions of genes and transcripts mapping to each region of association. The lower panel shows the chromatin state segmentation track (ChromHMM).

Bioinformatic analysis of risk variants

To gain insight into the biological basis of the associations we analysed publicly available RNA-seq expression and SNP data from TCGA on 223 colonic and 75 rectal cancers using rs10904850 and rs2744753 as proxies for rs10904849 (r2 = 0.97; D’ = 1.00) and rs72647484 (r2 = 0.64; D’ = 0.89) respectively. After adjustment for multiple testing, no significant associations were seen between SNP genotype and expression of genes mapping to any of the three risk loci (Supplementary Table 4). We examined whether any of the SNPs or their proxies (i.e. r2 > 0.8 in 1000 Genomes CEU reference panel) lie at putative transcription factor binding/enhancer elements and derived GERP and PhastCons scores to asses sequence conservation at these positions (Supplementary Table 5). rs16941835 maps to a regulatory feature with histone modification suggestive of an enhancer element. rs10904852, in LD with rs10904849 (r2 = 0.95, D’ = 1.00) is conserved (GERP and PhastCons scores of 1.20 and 0.47 respectively) with CADD score of 11.53. A moderate CADD score (8.21) was associated with rs7267484 (22,590,125 bps) which is strong LD with rs72647489 (r2 = 0.93, D’ = 1.00). Six proxy SNPs in LD with rs16941835 showed some evidence of transcription factor binding (Supplementary Table 5). We made use of TCGA data to examine the frequency of somatic mutation of CDC42, WNT4, FOXL1 or CUBN in CRC. None of these genes showed evidence of significant somatic mutation. Next, we conducted pathway analysis to determine whether any genes mapping to the three newly identified regions act in pathways already over-represented in GWAS. Pathways containing three or more genes are shown in Supplementary Table 6. While this analysis identifies the BMP-signalling pathway as expected, no catalogued pathways were discernible involving genes mapping to any of the newly identified regions. It is increasingly recognized that some genetic variants can have pleiotropic effects, influencing the risk of more than one cancer type. To explore the possibility that rs72647484, rs10904849 or rs16941835 affects the risk of other malignancies, we examined the association with lung cancer47, acute lymphoblastic leukaemia48, multiple myeloma49, glioma50 and meningioma51 using data from previously reported GWASs. However, for these cancers, there was no evidence of rs72647484, rs10904849 or rs16941835 (or correlated SNP r2 ≥ 0.8) being associated with tumour risk (i.e. P > 0.05). Finally, the relationship between clinico-pathological variables (sex, age at diagnosis, family history of CRC, tumour stage or microsatellite instability (MSI), KRAS-mutant status and BRAF-mutant status) and genotype at rs72647484, rs10904849 and rs16941835 was assessed by case-only logistic regression (Supplementary Table 7). There was evidence of a relationship between rs72647484 and KRAS-mutant status (P = 0.03) with the T risk allele associated with KRAS-mutant CRC; however this finding was not significant after accounting for multiple testing. None of the other SNPs showed any association with any of the clinico-pathological variables examined (i.e. P > 0.05).

Discussion

We have provided evidence supporting the existence of new susceptibility loci for CRC at 1p36.12, 10p13 and 16q24.1. The 1p36.12 association implicates WNT4 and/or CDC42 as possible determinates of CRC risk. WNT4 is part of a family of structurally related genes that encode cysteine-rich secreted glycoproteins that act as extracellular signalling factors. WNT4, WNT14, and WNT16 may play redundant roles in signalling through the CTNNB1-mediated canonical Wnt-pathway52 which is known to play a central role in colorectal tumorigenesis. Additionally, WNT4 signalling appears to play a pivotal role during organogenesis, acting as an autoinducer of mesenchyme-to-epithelial transition. Inactivating germline mutations in WNT4 cause mullerian aplasia and hyperandrogenism (MIM 158330) and are responsible for the autosomal recessive SERKAL syndrome (Sex Reversal and Kidney, Adrenal, and Lung dysgenesis; MIM 611812). A priori dysfunction of either WNT4 or CDC42 could be the biological basis for the 1p36.12 association. Cdc42 is a Ras-related GTP-binding protein with roles in establishment of cell polarity, regulation of cell morphology, motility, and cell cycle progression in mammalian cells, and malignant transformation53. Notably, Cdc42 regulates the actin cytoskeleton through activation of WASP proteins and cell polarity through GSK3-beta and APC. Rho-GTPase signalling has a documented role in the development of CRC54. Activation of Rho GTPase Cdc42 promotes adhesion and invasion in CRC55 and targeting Cdc42 with AZA197 suppresses primary colon cancer growth and prolongs survival in a xenograft model through down regulation of PAK156. Since rs10904849 is intronic to CUBN and the region of LD does not encompass any other genes or transcripts, there is a high likelihood that the functional basis of the 10p13 association is mediated through CUBN. Cubilin is the intestinal receptor for the endocytosis of intrinsic factor-vitamin B12 and a receptor in epithelial apoA-I/HDL metabolism57. Additionally cubilin is an important co-receptor in the endocytic pathway for retrieval of 25(OH)D3-DBP complexes by megalin-mediated endocytosis in the kidney58. Germline mutations in CUBN cause recessive megaloblastic anemia-1 (MGA1; MIM 261100). It is conceivable that common genetic variance in CUBN, while being insufficient to cause a “MGA type phenotype” would have physiological effects by virtue of long term effect on the cellular bioavailability of B12. Although it is entirely speculative, as epidemiological studies have yet to convincingly establish levels of B12 as a risk factor for CRC5960, its role in DNA biosynthesis makes genetically determined variation in B12 availability a plausible candidate for a role in the development of CRC. LncRNAs are regulators of transcription and are increasingly recognised as playing a role in cancer biology. While there is currently no evidence to implicate the RP11-58A18.1 lncRNA in CRC, lncRNAs CCAT1 and CCAT2 probably do play such roles6162, and it is entirely plausible that the impact of variation at 16q24.1 on risk is mediated through similar long range effects. One of the reasons for the failure to identify these CRC-loci previously is that, in addition to the issue of study power, they were not optimally tagged by SNPs featured on many commercial arrays. The power of our study to detect the major common loci conferring risks of 1.2 or greater (such as the 18q24 variant) was high. Hence, it is very unlikely there are additional CRC SNPs with similar effects for alleles with frequencies >0.2 in populations of European ancestry. In this study, we have only considered SNPs showing evidence of an association with a stipulated P-value threshold of <1 × 10−7. There exist, however, many variants with P-values just above this threshold which may also warrant investigation in a further study (Fig. 1). Hence further efforts to expand the scale of GWAS meta-analyses, in terms of both sample size and SNP coverage, and to increase the number of SNPs taken forward to large-scale replication, may identify additional variants for CRC. In conclusion, we have provided evidence for 3 new susceptibility loci for CRC. Our data also provide further evidence for the value of meta-analysis and the value of imputation as a means of enhancing the detection of novel risk loci thereby extending the utility of GWAS data.

Author Contributions

The study was designed and financial support was obtained by R.S.H., I.P.M.T., M.G.D. and J.P.C. The manuscript was drafted by R.S.H. with input from I.P.M.T., M.G.D. and J.P.C. All authors had access to data, analysis and had opportunity to contribute to drafting the manuscript. J.P.C. initiated and directed the GWAS of COIN and COIN-B. S.I. was responsible for COIN blood DNA extractions and quantification, R.H. for aliquoting and manifest preparation for Axiom genotyping and C.G.S. for somatic profiling of COIN tumours. N.A.A. and B.F.M. coordinated, and S.M.W. performed, the genotyping of COIN and COIN-B samples on the Axiom platform. T.S.M. was CI/co-CI of COIN and COIN-B, and R.K. provided access to linked clinico-pathological data. Cleaning of the COIN/COIN-B genotyping data and all statistical and bioinformatic analyses were conducted by N.W. and F.H., with contributions from S.D. and B.K., under the supervision of R.S.H. ICR - Sample preparation and genotyping were performed by A.L. and N.W. Oxford and local collaborators: subject recruitment and sample acquisition were done by E.B., M.G., L.M., R.K., D.K., and members of the CORGI Consortium. Sample preparation and genotyping were performed by C.P. Colon Cancer Genetics Group, Edinburgh and local collaborators: subject recruitment and sample acquisition were performed by S.M.F., C.H., H.C., I.D. and M.G.D., as well as members of SOCCS and COGS recruitment teams. Sample preparation was coordinated by S.M.F. Genotyping and analysis was performed and coordinated by S.M.F., C.H., M.G.D. and A.T. For the colon CFR datasets - D.D.B., A.K.W., J.H., M.J., F.S., G.C., S.G., N.L., P.N. and D.C. performed sample ascertainment and analysis.

Additional Information

How to cite this article: Al-Tassan, N. A. et al. A new GWAS and meta-analysis with 1000Genomes imputation identifies novel risk variants for colorectal cancer. Sci. Rep. 5, 10442; doi: 10.1038/srep10442 (2015).

57 in total

1. The gamma-subunit of the coatomer complex binds Cdc42 to mediate transformation.

Authors: W J Wu; J W Erickson; R Lin; R A Cerione
Journal: Nature Date: 2000-06-15 Impact factor: 49.962

2. Variation at 10p12.2 and 10p14 influences risk of childhood B-cell acute lymphoblastic leukemia and phenotype.

Authors: Gabriele Migliorini; Bettina Fiege; Fay J Hosking; Yussanne Ma; Rajiv Kumar; Amy L Sherborne; Miguel Inacio da Silva Filho; Jayaram Vijayakrishnan; Rolf Koehler; Hauke Thomsen; Julie A Irving; James M Allan; Tracy Lightfoot; Eve Roman; Sally E Kinsey; Eamonn Sheridan; Pamela Thompson; Per Hoffmann; Markus M Nöthen; Thomas W Mühleisen; Lewin Eisele; Martin Zimmermann; Claus R Bartram; Martin Schrappe; Mel Greaves; Martin Stanulla; Kari Hemminki; Richard S Houlston
Journal: Blood Date: 2013-08-30 Impact factor: 22.113

3. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal.

Authors: Jianjiong Gao; Bülent Arman Aksoy; Ugur Dogrusoz; Gideon Dresdner; Benjamin Gross; S Onur Sumer; Yichao Sun; Anders Jacobsen; Rileen Sinha; Erik Larsson; Ethan Cerami; Chris Sander; Nikolaus Schultz
Journal: Sci Signal Date: 2013-04-02 Impact factor: 8.192

4. Clinical significance of long intergenic noncoding RNA-p21 in colorectal cancer.

Authors: Haiyan Zhai; Andrew Fesler; Kristina Schee; Oystein Fodstad; Kjersti Flatmark; Jingfang Ju
Journal: Clin Colorectal Cancer Date: 2013-09-05 Impact factor: 4.481

5. Somatic profiling of the epidermal growth factor receptor pathway in tumors from patients with advanced colorectal cancer treated with chemotherapy ± cetuximab.

Authors: Christopher G Smith; David Fisher; Bart Claes; Timothy S Maughan; Shelley Idziaszczyk; Gilian Peuteman; Rebecca Harris; Michelle D James; Angela Meade; Bharat Jasani; Richard A Adams; Sarah Kenny; Richard Kaplan; Diether Lambrechts; Jeremy P Cheadle
Journal: Clin Cancer Res Date: 2013-06-05 Impact factor: 12.531

6. Dietary intake of B vitamins and methionine and colorectal cancer risk.

Authors: Julie K Bassett; Gianluca Severi; Allison M Hodge; Laura Baglietto; John L Hopper; Dallas R English; Graham G Giles
Journal: Nutr Cancer Date: 2013 Impact factor: 2.900

7. Environmental and heritable factors in the causation of cancer--analyses of cohorts of twins from Sweden, Denmark, and Finland.

Authors: P Lichtenstein; N V Holm; P K Verkasalo; A Iliadou; J Kaprio; M Koskenvuo; E Pukkala; A Skytthe; K Hemminki
Journal: N Engl J Med Date: 2000-07-13 Impact factor: 91.245

8. CCAT2, a novel noncoding RNA mapping to 8q24, underlies metastatic progression and chromosomal instability in colon cancer.

Authors: Hui Ling; Riccardo Spizzo; Yaser Atlasi; Milena Nicoloso; Masayoshi Shimizu; Roxana S Redis; Naohiro Nishida; Roberta Gafà; Jian Song; Zhiyi Guo; Cristina Ivan; Elisa Barbarotto; Ingrid De Vries; Xinna Zhang; Manuela Ferracin; Mike Churchman; Janneke F van Galen; Berna H Beverloo; Maryam Shariati; Franziska Haderk; Marcos R Estecio; Guillermo Garcia-Manero; Gijs A Patijn; David C Gotley; Vikas Bhardwaj; Imad Shureiqi; Subrata Sen; Asha S Multani; James Welsh; Ken Yamamoto; Itsuki Taniguchi; Min-Ae Song; Steven Gallinger; Graham Casey; Stephen N Thibodeau; Loïc Le Marchand; Maarit Tiirikainen; Sendurai A Mani; Wei Zhang; Ramana V Davuluri; Koshi Mimori; Masaki Mori; Anieta M Sieuwerts; John W M Martens; Ian Tomlinson; Massimo Negrini; Ioana Berindan-Neagoe; John A Foekens; Stanley R Hamilton; Giovanni Lanza; Scott Kopetz; Riccardo Fodde; George A Calin
Journal: Genome Res Date: 2013-06-24 Impact factor: 9.043

9. Activation of Rho GTPase Cdc42 promotes adhesion and invasion in colorectal cancer cells.

Authors: Lei Gao; Lan Bai; Qing zhen Nan
Journal: Med Sci Monit Basic Res Date: 2013-07-25

10. Common variation at 3q26.2, 6p21.33, 17p11.2 and 22q13.1 influences multiple myeloma risk.

Authors: Daniel Chubb; Niels Weinhold; Peter Broderick; Bowang Chen; David C Johnson; Asta Försti; Jayaram Vijayakrishnan; Gabriele Migliorini; Sara E Dobbins; Amy Holroyd; Dirk Hose; Brian A Walker; Faith E Davies; Walter A Gregory; Graham H Jackson; Julie A Irving; Guy Pratt; Chris Fegan; James Al Fenton; Kai Neben; Per Hoffmann; Markus M Nöthen; Thomas W Mühleisen; Lewin Eisele; Fiona M Ross; Christian Straka; Hermann Einsele; Christian Langer; Elisabeth Dörner; James M Allan; Anna Jauch; Gareth J Morgan; Kari Hemminki; Richard S Houlston; Hartmut Goldschmidt
Journal: Nat Genet Date: 2013-08-18 Impact factor: 38.330

65 in total

Review 1. RAB and RHO GTPases regulate intestinal crypt cell homeostasis and enterocyte function.

Authors: Xiao Zhang; Nan Gao
Journal: Small GTPases Date: 2016-05-04

2. Establishing a valid approach for estimating familial risk of cancer explained by common genetic variants.

Authors: Korbinian Weigl; Jenny Chang-Claude; Li Hsu; Michael Hoffmeister; Hermann Brenner
Journal: Int J Cancer Date: 2019-10-16 Impact factor: 7.396

Review 3. Chromatin-state discovery and genome annotation with ChromHMM.

Authors: Jason Ernst; Manolis Kellis
Journal: Nat Protoc Date: 2017-11-09 Impact factor: 13.491

4. Cohort Profile: The Colon Cancer Family Registry Cohort (CCFRC).

Authors: Mark A Jenkins; Aung Ko Win; Allyson S Templeton; Maggie S Angelakos; Daniel D Buchanan; Michelle Cotterchio; Jane C Figueiredo; Stephen N Thibodeau; John A Baron; John D Potter; John L Hopper; Graham Casey; Steven Gallinger; Loic Le Marchand; Noralane M Lindor; Polly A Newcomb; Robert W Haile
Journal: Int J Epidemiol Date: 2018-04-01 Impact factor: 7.196

Review 5. Epigenome-based cancer risk prediction: rationale, opportunities and challenges.

Authors: Martin Widschwendter; Allison Jones; Iona Evans; Daniel Reisel; Joakim Dillner; Karin Sundström; Ewout W Steyerberg; Yvonne Vergouwe; Odette Wegwarth; Felix G Rebitschek; Uwe Siebert; Gaby Sroczynski; Inez D de Beaufort; Ineke Bolt; David Cibula; Michal Zikan; Line Bjørge; Nicoletta Colombo; Nadia Harbeck; Frank Dudbridge; Anne-Marie Tasse; Bartha M Knoppers; Yann Joly; Andrew E Teschendorff; Nora Pashayan
Journal: Nat Rev Clin Oncol Date: 2018-02-27 Impact factor: 66.675

6. Genetic Risk Score Is Associated With Prevalence of Advanced Neoplasms in a Colorectal Cancer Screening Population.

Authors: Korbinian Weigl; Hauke Thomsen; Yesilda Balavarca; Jacklyn N Hellwege; Martha J Shrubsole; Hermann Brenner
Journal: Gastroenterology Date: 2018-03-21 Impact factor: 22.682

7. Cost-Effectiveness of Personalized Screening for Colorectal Cancer Based on Polygenic Risk and Family History.

Authors: Dayna R Cenin; Steffie K Naber; Anne C de Weerdt; Mark A Jenkins; David B Preen; Hooi C Ee; Peter C O'Leary; Iris Lansdorp-Vogelaar
Journal: Cancer Epidemiol Biomarkers Prev Date: 2019-11-20 Impact factor: 4.254

8. Colorectal cancer susceptibility variants and risk of conventional adenomas and serrated polyps: results from three cohort studies.

Authors: Dong Hang; Amit D Joshi; Xiaosheng He; Andrew T Chan; Manol Jovani; Manish K Gala; Shuji Ogino; Peter Kraft; Constance Turman; Ulrike Peters; Stephanie A Bien; Yi Lin; Zhibin Hu; Hongbing Shen; Kana Wu; Edward L Giovannucci; Mingyang Song
Journal: Int J Epidemiol Date: 2020-02-01 Impact factor: 7.196

9. Identification of Susceptibility Loci and Genes for Colorectal Cancer Risk.

Authors: Chenjie Zeng; Koichi Matsuda; Wei-Hua Jia; Jiang Chang; Sun-Seog Kweon; Yong-Bing Xiang; Aesun Shin; Sun Ha Jee; Dong-Hyun Kim; Ben Zhang; Qiuyin Cai; Xingyi Guo; Jirong Long; Nan Wang; Regina Courtney; Zhi-Zhong Pan; Chen Wu; Atsushi Takahashi; Min-Ho Shin; Keitaro Matsuo; Fumihiko Matsuda; Yu-Tang Gao; Jae Hwan Oh; Soriul Kim; Keum Ji Jung; Yoon-Ok Ahn; Zefang Ren; Hong-Lan Li; Jie Wu; Jiajun Shi; Wanqing Wen; Gong Yang; Bingshan Li; Bu-Tian Ji; Hermann Brenner; Robert E Schoen; Sébastien Küry; Stephen B Gruber; Fredrick R Schumacher; Stephanie L Stenzel; Graham Casey; John L Hopper; Mark A Jenkins; Hyeong-Rok Kim; Jin-Young Jeong; Ji Won Park; Kazuo Tajima; Sang-Hee Cho; Michiaki Kubo; Xiao-Ou Shu; Dongxin Lin; Yi-Xin Zeng; Wei Zheng
Journal: Gastroenterology Date: 2016-03-08 Impact factor: 22.682

10. Evaluation of genetic variants in association with colorectal cancer risk and survival in Asians.

Authors: Nan Wang; Yingchang Lu; Nikhil K Khankari; Jirong Long; Hong-Lan Li; Jing Gao; Yu-Tang Gao; Yong-Bing Xiang; Xiao-Ou Shu; Wei Zheng
Journal: Int J Cancer Date: 2017-06-21 Impact factor: 7.396