Genome-wide association study of colorectal cancer in Hispanics.

Stephanie L Schmit^1,2,3, Fredrick R Schumacher^1,2, Christopher K Edlund^1,2, David V Conti^1,2, Ugonna Ihenacho^1,2, Peggy Wan^1,2, David Van Den Berg¹, Graham Casey^1,2, Barbara K Fortini⁴, Heinz-Josef Lenz^1,2, Teresa Tusié-Luna^5,6, Carlos A Aguilar-Salinas⁵, Hortensia Moreno-Macías⁷, Alicia Huerta-Chagoya^5,6, María Luisa Ordóñez-Sánchez⁷, Rosario Rodríguez-Guillén⁷, Ivette Cruz-Bautista⁷, Maribel Rodríguez-Torres⁷, Linda Liliana Muñóz-Hernández⁷, Olimpia Arellano-Campos⁵, Donají Gómez⁷, Ulices Alvirde⁷, Clicerio González-Villalpando^8,9, María Elena González-Villalpando⁹, Loic Le Marchand¹⁰, Christopher A Haiman^1,2, Jane C Figueiredo^1,2.

Abstract

Genome-wide association studies (GWAS) have identified 58 susceptibility alleles across 37 regions associated with the risk of colorectal cancer (CRC) with P < 5×10^−8. Most studies have been conducted in non-Hispanic whites and East Asians; however, the generalizability of these findings and the potential for ethnic-specific risk variation in Hispanic and Latino (HL) individuals have been largely understudied. We describe the first GWAS of common genetic variation contributing to CRC risk in HL (1611 CRC cases and 4330 controls). We also examine known susceptibility alleles and implement imputation-based fine-mapping to identify potential ethnicity-specific association signals in known risk regions. We discovered 17 variants across 4 independent regions that merit further investigation due to suggestive CRC associations (P < 1×10^−6) at 1p34.3 [rs7528276; odds ratio (OR) = 1.86 (95% confidence interval (CI): 1.47–2.36); P = 2.5×10^−7], 2q23.3 [rs1367374; OR = 1.37 (95% CI: 1.21–1.55); P = 4.0×10^−7], 14q24.2 [rs143046984; OR = 1.65 (95% CI: 1.36–2.01); P = 4.1×10^−7] and 16q12.2 [rs142319636; OR = 1.69 (95% CI: 1.37–2.08); P = 7.8×10^−7]. Among the 57 previously published CRC susceptibility alleles with minor allele frequency ≥1%, 76.5% of SNPs had a consistent direction of effect and 19 (33.3%) were nominally statistically significant (P < 0.05). Further, rs185423955 and rs60892987 were identified as novel secondary susceptibility variants at 3q26.2 (P = 5.3×10^−5) and 11q12.2 (P = 6.8×10^−5), respectively. Our findings demonstrate the importance of fine mapping in HL. These results are informative for variant prioritization in functional studies and future risk prediction modeling in minority populations.

Year: 2016 PMID: 27207650 PMCID: PMC4876992 DOI: 10.1093/carcin/bgw046

Source DB: PubMed Journal: Carcinogenesis ISSN: 0143-3334 Impact factor: 4.944

Introduction

Colorectal cancer (CRC) is the third most common cancer and the fourth leading cause of cancer deaths worldwide (1). The Hispanic/Latino (HL) population is the fastest growing ethnic group in the United States, with its size expected to reach 26.5% of the total population by 2050 (2,3). CRC remains the second most common cancer and third most common cause of cancer-related death in the USA among HL (4). Further, disparities in disease presentation and outcomes are evident in this ethnic group. Several studies have observed an increasing trend of early-onset disease (<50 years) and a greater likelihood of late-stage tumors or metastatic disease, especially in the last few decades (5–9). In addition to well-characterized environmental influences, family history is among the strongest risk factors for CRC, with genetic factors accounting for an estimated 12–35% of the variation in risk of developing the disease among Europeans (10,11). Genome-wide association studies (GWAS) of CRC have been instrumental in identifying common (minor allele frequency, MAF ≥ 5%) low-penetrance susceptibility variants; such efforts have identified 58 susceptibility alleles across 37 regions associated with CRC at P < 5×10^−8 (12–32). To date, the majority of CRC GWAS have been limited to non-Hispanic whites and East Asians, and the generalizability of resultant findings to other ethnic groups where CRC-specific incidence and mortality disparities exist has yet to be comprehensively explored. Specifically, HL individuals have been largely understudied in terms of genetic susceptibility to CRC. In addition, novel CRC-associated variation specific to other populations may exist due to relevant alleles being more common or to different distributions of important environmental factors. Recent examples of ethnic-specific variation are evident in other complex diseases including Latino-specific susceptibility alleles associated with breast cancer and type 2 diabetes (29,33–35).
The possibility of ethnic-specific variation has not been widely studied in diverse populations in relation to risk of CRC beyond East Asians, and to a lesser extent, African Americans (26–29,36). To our knowledge, only one small study in Colombians has examined the association of genetic ancestry with colorectal adenomas and adenocarcinomas and described a positive association between African ancestry and CRC (37). Fine-mapping of genetic association signals can reduce the number of candidate single nucleotide polymorphisms (SNPs) or insertions/deletions (indels) considered for time-consuming and expensive functional follow-up of GWAS-identified risk regions (31,38–40). Fine-mapping studies in multiethnic and admixed populations have been suggested to be more powerful than studies in a single or genetically homogeneous ethnic group for localizing functional variants, as shorter linkage disequilibrium (LD) blocks can help to decrease the set of SNPs correlated with a functional allele (41–44). Indeed, the limited set of prior CRC studies focusing on racial/ethnic minorities has proven informative in fine-mapping known risk regions as well as in identifying novel risk loci undetected by GWAS in non-Hispanic white populations. For example, 15 CRC risk SNPs have been discovered in East Asian and African-American populations, two from fine-mapping efforts (26–29,31,36). Individuals of Admixed American (AMR) descent, including HL with diverse backgrounds of European, Native American and African ancestries, present an additional opportunity for fine-mapping because of the group’s shorter shared haplotypes around variants of all frequencies as compared to European-only populations (45). In combination, the unique LD structure and allele frequency spectrum of HL populations may assist in localizing association signals (46,47).
The goals of this study were to identify novel variants conferring genetic susceptibility to CRC in the rapidly growing HL ethnic group (48) and to leverage this population’s unique LD structure for the fine-mapping of known risk regions identified by GWAS in other ethnic groups.

Materials and methods

Study participants

This investigation of genetic contributions to risk of CRC in HL includes cases and controls from three main studies. Epidemiologic and clinical characteristics of the studies are summarized in Supplementary Table 1, available at Carcinogenesis Online, and described briefly below.

Hispanic colorectal cancer study

The Hispanic Colorectal Cancer Study (HCCS) is a population-based study of individuals self-identified as Hispanic with a diagnosis of CRC. Cases are identified from the California Cancer Registry or directly from local hospitals in the Los Angeles region [LAC + USC County Hospital and University of Southern California (USC) Norris Comprehensive Cancer Center]. All men and women over 21 years of age with a first time diagnosis of CRC (ICD-O-3 codes: C18–C21) after January 1, 2008 were eligible for participation. Risk factor/dietary questionnaires, pathology reports and saliva samples (for genotyping) were collected using methodologies developed in the Colon Cancer Family Registry (70) and the Multiethnic Cohort (MEC) (71). The present study includes 950 cases recruited into the HCCS who were born in Mexico (42.3%), the USA (31.4%), Central/South America (16.6%), Cuba or the Caribbean Islands (1.8%) or Europe (0.4%). This study was approved by the University of Southern California Institutional Review Board and the California Committee for the Protection of Human Subjects, and all participants provided written informed consent.

Multiethnic cohort study

The Multiethnic Cohort (MEC) study is a large prospective cohort study that includes subjects from various ethnic groups, including HL individuals primarily from California, and mainly from Los Angeles (71). Between 1993 and 1996, participants returned a self-administered baseline questionnaire that obtained general demographic, medical and risk factor information. The MEC used state driver’s license files as the primary source to identify study subjects in California. Surnames were used to identify HL individuals because race/ethnicity was not available in driver’s license files. In the cohort, incident cancer cases are identified annually through cohort linkage to population-based Surveillance, Epidemiology and End Results cancer registries in Los Angeles County as well as to the California State cancer registry in the same manner as in the HCCS. All men and women over age 21 with a first time diagnosis of CRC (ICD-O-3 codes: C18–C21) were included as eligible cases. The current study used questionnaire data and DNA samples derived from whole blood or buccal cells for 661 HL prevalent or incident CRC cases born in the USA (57.8%), Mexico (27.7%), Central/South America (9.7%), Cuba or the Caribbean Islands (4.4%), or Europe (0.2%). Individuals without a diagnosis of CRC were used as controls (n = 2,106). All MEC controls self-reported being born in the USA (52.2%), Mexico (34.3%), or Central/South America (13.2%). This study was approved by the University of Southern California and the University of Hawaii Institutional Review Boards, and all participants provided informed consent.

Slim initiative in genomic medicine for the Americas

Additional controls for this CRC GWAS consisted of participants from a GWAS of type 2 diabetes conducted by the SIGMA Type 2 Diabetes Consortium. The primary goal of this consortium was to characterize the genetic basis of type 2 diabetes in four component studies: (i) Diabetes in Mexico Study (DMS, n = 472), (ii) Mexico City Diabetes Study (MCDS; n = 614), (iii) Universidad Nacional Autónoma de México (UNAM)/Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán Diabetes Study (UIDS; n = 1138) and (iv) the MEC (n = 2,106; described above) (35). Whole blood-derived DNA samples in the present analysis included SIGMA participants without a diagnosis of diabetes. All participants in DMS, MCDS and UIDS were identified as Mexican.

Genotyping and imputation

The HCCS and MEC CRC cases were genotyped using the Illumina HumanOmni2.5Exome-8v1.0 and HumanOmni2.5Exome-8v1.1 BeadChip arrays in the USC Norris Comprehensive Cancer Center Molecular Genomics Core (Los Angeles, CA, USA). MEC controls and controls derived from the Mexican SIGMA studies were genotyped using the Illumina HumanOmni2.5-4v1 SNP array at the Broad Institute Genetic Analysis Platform (Cambridge, MA, USA). Samples not passing SIGMA’s standard QC procedures for raw data, as described previously, were removed prior to downstream QC steps on the combined set of cases and controls (35). Controls from the MEC-SIGMA study (n = 93) were also genotyped on the HumanOmni2.5Exomev1 array (n = 62) and HumanOmni2.5Exome-8v1.1 array at the University of Southern California to allow for cross-platform validation. Genotype data were cleaned based on QC metrics at the individual subject and SNP levels. In brief, samples with <95% call rate, unintended replicates, sex mismatches between self-reported and genotypic predicted sex, and identity-by-descent with another sample were removed. Monomorphic SNPs, SNPs with <95% call rate and SNPs with mismatching alleles across platforms were eliminated. We also removed: SNPs with low concordance between intentional cross-platform replicates; SNPs not compared due to low call rate; SNPs discordant between platforms (using HapMap samples genotyped by Illumina); SNPs discordant in within-platform duplicates; and SNPs not present in all datasets. Eleven CRC cases from the MEC study were identified since selection for SIGMA participation, so these individuals were genotyped with controls but treated as cases for analytic purposes. All SNPs overlapping 1000 Genomes Project genotypes were matched to the forward strand. Imputation of genotypes was performed for both autosomal and X chromosome markers, and all samples were imputed together. 
The target panel was pre-phased using SHAPE-IT v2.r790 (72), and IMPUTE2 v2.3.2 (73) was used to impute missing genotypes based on the multiethnic panel of reference haplotypes from Phase 3 of the 1000 Genomes Project (October 12, 2014 release) for autosomal markers and from Phase 1 of the 1000 Genomes Project (March 2012 release) for chromosome X markers (45,74). Genetic markers resulting from the imputation were required to pass stringent imputation quality and accuracy filters prior to entering the analysis phase: info ≥ 0.7, certainty ≥ 0.9 and, for genotyped markers only, concordance ≥ 0.9 between directly measured and imputed genotypes after masking input genotypes.
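
As an illustration only (this is not the pipeline used in the study), the post-imputation filter described above can be sketched as a simple predicate. The `Variant` record and its field names are hypothetical, and the sketch assumes that the concordance criterion is applied only where a masked-genotype concordance is available, i.e. for directly genotyped markers.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the text; everything else here is illustrative.
INFO_MIN = 0.7
CERTAINTY_MIN = 0.9
CONCORDANCE_MIN = 0.9


@dataclass
class Variant:
    """Hypothetical per-variant summary of imputation quality metrics."""
    rsid: str
    info: float
    certainty: float
    # Concordance between measured and imputed genotypes after masking;
    # None for variants that were imputed only (assumption).
    concordance: Optional[float] = None


def passes_imputation_qc(v: Variant) -> bool:
    """Return True if the variant meets all applicable quality filters."""
    if v.info < INFO_MIN or v.certainty < CERTAINTY_MIN:
        return False
    if v.concordance is not None and v.concordance < CONCORDANCE_MIN:
        return False
    return True
```

Treating a missing concordance as "not applicable" rather than as a failure is a design choice of this sketch; a real pipeline might handle that case differently.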

Statistical analysis

Ancestry analysis

Percent ancestry from major population subgroups was estimated for each participant using fastSTRUCTURE software with k = 4 and including HapMap3 samples (European = CEU, TSI; Asian = CHB, JPT, CHD; African = LWK, MKK, YRI) (75). Principal components analysis was conducted using EIGENSOFT v6.0.1 on a panel of ancestry informative markers derived from the literature, the Illumina Infinium HumanExome BeadChip and the Affymetrix Axiom® Exome Array (76–79). Principal components analysis was run twice, once on study samples in combination with HapMap3 samples (2254 ancestry informative markers) to identify ethnic outliers, and subsequently, with study samples only (2616 ancestry informative markers) to generate PCs for global ancestry adjustment in association analyses.

Discovery

A genome-wide association analysis with risk of CRC was conducted using 9 875 636 directly genotyped and high-quality imputed SNPs and indels with MAF ≥ 1% in our full study sample. The association between the allelic dosage of each variant and the risk of CRC was evaluated, assuming a log-additive genetic model, using PLINK v1.07 (80). Per-allele odds ratios and 95% confidence intervals (CI) were estimated using unconditional logistic regression adjusted for age, sex and the first ten PCs for global ancestry. The P value threshold for statistical significance in the discovery GWAS was set at the traditional genome-wide value of 5×10^−8.
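
Under a log-additive model, the per-allele odds ratio and its confidence interval follow directly from the logistic regression coefficient and its standard error: OR = exp(β) and 95% CI = exp(β ± 1.96 × SE). The sketch below shows that conversion and the genome-wide threshold check. The function names are hypothetical, and a Wald-type interval is assumed; the text does not state which interval method was used.

```python
import math

Z_95 = 1.959963984540054      # standard normal quantile for a two-sided 95% CI
GENOME_WIDE_ALPHA = 5e-8      # threshold stated in the text


def odds_ratio_ci(beta, se, z=Z_95):
    """Convert a log-odds coefficient and its SE to (OR, lower, upper).

    Assumes a Wald interval on the log-odds scale (an assumption of this
    sketch, not a detail confirmed by the source).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def is_genome_wide_significant(p):
    """True if the P value is below the conventional 5e-8 threshold."""
    return p < GENOME_WIDE_ALPHA


if __name__ == "__main__":
    # Rounded OR and SE reported for rs185423955 in Table 2.
    or_, lo, hi = odds_ratio_ci(math.log(1.61), 0.11)
    print(f"OR = {or_:.2f} (95% CI: {lo:.2f}-{hi:.2f})")
```

Applied to the rounded values reported for rs185423955 in Table 2 (OR = 1.61, SE = 0.11), this gives an interval of roughly 1.30 to 2.00, close to the reported 1.29 to 2.01; a small gap is expected because the inputs are rounded.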

Investigation of previously reported susceptibility regions

Replication of index variants: In addition to the search for novel susceptibility alleles, we examined the association between 57 previously reported susceptibility alleles (i.e. index SNPs or index variants) with MAF ≥ 1% in our study and risk of CRC. Again, association testing was conducted using unconditional logistic regression assuming a log-additive genetic model, with adjustment for age, sex and 10 PCs. Criteria for replication included (i) a consistent direction of effect with the previously published risk allele and (ii) a nominally statistically significant P value (P < 0.05). Next, we characterized the broader region surrounding each index SNP (±500 kilobases, kb) to examine generalizability in HL. We summarized the strongest association signals in our HL study along with the r² values corresponding to their respective index variants in the original GWAS population.

Identification of secondary susceptibility alleles: Finally, we characterized independent secondary signals (i.e. novel markers) in each known susceptibility region. To accomplish this, we conducted fine-mapping in ±500kb windows surrounding each of the 57 index SNPs that had MAF ≥ 1% in our dataset. To screen for regions of interest, we calculated empirical P value thresholds for statistical significance that accounted for the number of correlated SNPs in each region. These region-specific thresholds were based on a Bonferroni correction for the number of markers needed to tag all SNPs with MAF ≥ 5% in high LD (r² ≥ 0.8) with the index based on the 1000 Genomes Phase I AMR population. The region-specific P values are detailed in Supplementary Table 3, available at Carcinogenesis Online. For further analysis, we selected regions in which the most highly associated SNP (i) correlated weakly with the index SNP (r² < 0.2 in the original discovery population) and (ii) exceeded our region-specific P value threshold. For these regions, we conducted an association analysis with logistic regression conditional on the index SNP’s dosage in R version 3.2.2. If a variant in moderate LD (r² ≥ 0.2) with the index demonstrated a more statistically significant association with CRC in our unconditional examination of known susceptibility regions, then we instead conditioned on that variant. A secondary signal was defined as a variant that remained associated with risk of CRC with a P-value below the region-specific threshold upon conditional analysis. LocusZoom plots with LD shading based on r² in 1000 Genomes AMR samples were generated to visualize the unconditional and conditional association results in each 1Mb region surrounding the index or the lead variant which was in at least moderate LD (r² ≥ 0.2) with the index (81). We also conducted a sensitivity analysis that excluded CRC cases with diabetes.
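
The screening logic described above can be sketched as three small checks. This is a hedged illustration rather than the study's code: the function names are hypothetical, and the family-wise α of 0.05 used for the Bonferroni correction is an assumption, since the text specifies the correction but not the α level. The number of tag markers per region would come from an LD-based tagging step that is not reproduced here.

```python
def region_threshold(n_tag_markers, alpha=0.05):
    """Bonferroni threshold for one region.

    n_tag_markers: number of markers needed to tag the region's SNPs
    (supplied by the caller). alpha=0.05 is an assumption of this sketch.
    """
    if n_tag_markers < 1:
        raise ValueError("n_tag_markers must be at least 1")
    return alpha / n_tag_markers


def is_candidate_region(lead_p, lead_r2_with_index, threshold, r2_max=0.2):
    """Screen a region: its lead SNP must be weakly correlated with the
    index SNP (r2 < 0.2) and have P below the region-specific threshold."""
    return lead_r2_with_index < r2_max and lead_p < threshold


def is_secondary_signal(conditional_p, threshold):
    """A secondary signal keeps P below the threshold after conditioning
    on the index SNP (or on a stronger variant in LD with the index)."""
    return conditional_p < threshold
```

For example, with the 3q26.2 threshold of 1.3×10^−4 reported in the Results, a lead variant with unconditional P = 3.2×10^−5 and conditional P = 7.5×10^−5 passes both checks, provided its r² with the index is below 0.2.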

Results

Characteristics of study sample

Demographic and clinical characteristics of the 1611 CRC cases and 4330 controls included in this study are summarized in Table 1. Supplementary Table 1, available at Carcinogenesis Online, provides detailed descriptive statistics for participants in each component study. Case and control groups were statistically significantly different with respect to age, sex and body mass index. However, the absolute differences for age and body mass index were minimal, and as in standard GWAS practice, we adjusted for age and sex in our models. Differences in estimated European, Asian, African and Amerindian ancestries and place of birth were accounted for by adjustment for the first 10 principal components (PCs) for global ancestry. The differences in diabetes status and family history between cases and controls were driven by inclusion criteria and missing data for the Slim Initiative in Genomic Medicine for the Americas (SIGMA) controls.

Table 1.

Characteristics of Hispanic/Latino (HL) colorectal cancer cases and controls in a genome-wide association study of 5941 participants

		Cases^a	Controls^b	P
		N = 1611	N = 4330
Age [mean (SD)]		61.2 (12.3)	62.4 (10.2)	<0.01
BMI [mean (SD)]		29.2 (6.1)	27.5 (4.2)	<0.01
Sex (%)	Male	910 (56.5)	1845 (42.6)	<0.01
	Female	701 (43.5)	2485 (57.4)	<0.01
Place of birth	United States	680 (42.2)	1100 (25.4)	<0.01
	Mexico	585 (36.3)	2947 (68.1)^c
	Central or South America^d	222 (13.8)	277 (6.4)
	Europe	5 (0.3)	5 (0.1)
	Cuba or Caribbean Islands	46 (2.9)	0 (0.0)
Ancestry estimates [mean (SD)]^e
	European	0.50 (0.24)	0.39 (0.27)	<0.01
	East Asian	0.05 (0.10)	0.03 (0.07)	<0.01
	African	0.02 (0.06)	0.01 (0.02)	<0.01
	Amerindian	0.43 (0.25)	0.57 (0.28)	<0.01
Diabetes	No	1151 (74.2)	4330 (100.0)	<0.01
	Yes	400 (25.8)	0 (0.0)	<0.01
Family history of CRC (first degree relative)	No^f	1274 (79.1)	1675 (38.7)	<0.01
	Yes	146 (9.1)	114 (2.6)	<0.01
Cancer site	Colon	987 (61.3)	—
	Rectum	347 (21.5)	—
	Other	14 (0.9)	—
Stage at diagnosis^g	0	7 (0.4)	—
	1	436 (27.1)	—
	2	259 (16.1)	—
	3	347 (21.5)	—
	4	122 (7.6)	—

^a From the Hispanic Colorectal Cancer Study and the Multiethnic Cohort (California).

^b From the Slim Initiative in Genomic Medicine for the Americas (California and Mexico).

^c All non-MEC SIGMA participants were assumed to have been born in Mexico.

^d Argentina, Belize, Bolivia, Brazil, Chile, Colombia, Costa Rica, Ecuador, El Salvador, Guatemala, Honduras, Nicaragua, Panama, Peru or Uruguay.

^e % European, Asian and African ancestries were estimated using fastSTRUCTURE with HapMap3 European, Asian and African samples (k = 4).

^f 2224 controls were missing family history information.

^g 440 cases were missing stage information.

When combined with HapMap3 samples, there were no outliers (>5 standard deviations from the mean) on PCs 1–3. Therefore, all samples were retained for subsequent analysis. Examination of PCs 1–10 showed that only PCs 1 and 2 were statistically significantly different between cases and controls. Supplementary Figure 1, available at Carcinogenesis Online, shows pairwise plots of the first three PCs for global ancestry in our total study sample and indicates that cases and controls still had overlapping distributions of PCs 1 and 2, supporting our ability to appropriately adjust for these as covariates.
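
The outlier rule mentioned above (more than 5 standard deviations from the mean on a given PC) can be sketched as follows. This is an illustrative assumption about how such a check might be coded: the source does not say which samples define the mean and standard deviation, so the sketch simply uses all scores passed in, and the function name is hypothetical. It would be applied separately to each of PCs 1 to 3.

```python
from statistics import mean, stdev


def pc_outliers(scores, n_sd=5.0):
    """Return indices of samples whose score on one principal component
    lies more than n_sd sample standard deviations from the mean.

    The reference set used for the mean and SD is an assumption here:
    all supplied scores are used.
    """
    mu = mean(scores)
    sd = stdev(scores)
    if sd == 0:
        return []  # no spread, so nothing can be an outlier
    return [i for i, s in enumerate(scores) if abs(s - mu) > n_sd * sd]
```

An empty result on each of the three PCs corresponds to the situation reported above, in which all samples were retained.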

Discovery

In total, our GWAS scan included 9 875 636 genetic variants with MAF ≥ 1% that passed stringent quality control (QC) procedures, as depicted in the Manhattan plot in Supplementary Figure 2, available at Carcinogenesis Online. A genomic control inflation factor (λ) of 1.09 indicated adequate control for population stratification (Supplementary Figure 2, available at Carcinogenesis Online). At the standard genome-wide significance level of P < 5×10^−8, we did not observe any genetic markers that were statistically significantly associated with risk of CRC. However, 17 variants across 4 regions with highly suggestive CRC associations (P < 1×10^−6) were identified on chromosomes 1p34.3 [rs7528276; OR = 1.86 (95% CI: 1.47–2.36); P = 2.5×10^−7], 2q23.3 [rs1367374; OR = 1.37 (95% CI: 1.21–1.55); P = 4.0×10^−7], 14q24.2 [rs143046984; OR = 1.65 (95% CI: 1.36–2.01); P = 4.1×10^−7] and 16q12.2 [rs142319636; OR = 1.69 (95% CI: 1.37–2.08); P = 7.8×10^−7] (Supplementary Table 2, available at Carcinogenesis Online).

Replication

Among the 57 previously published CRC susceptibility alleles with MAF ≥ 1% in our study, 19 (33.3%) were associated with risk of CRC in HL at a nominal level of statistical significance (P < 0.05) (Supplementary Table 3, available at Carcinogenesis Online). The known susceptibility alleles that replicated most strongly in HL included rs10505477, rs6983267 and rs7014346 at 8q24.21 (rs6983267: P = 2.8×10^−5), rs3217810 at 12p13.32 (P = 2.2×10^−4) and rs4939827 at 18q21.1 (P = 1.2×10^−4). In HL, the 8q24.21 and 12p13.32 association signal regions were led by previously identified (‘index’) variants (rs6983267 and rs3217810, respectively), but the most strongly associated SNP at the 18q21.1 locus (rs4939827 ± 500kb) was rs11874392 (P = 6.4×10^−8; r² (EUR) = 0.93). A comparison of effect sizes from the current study and the initial published report indicated that 76.5% of SNPs had a consistent direction of effect (Figure 1). Only two potential outliers with respect to discordance were identified: rs35509282 on 4q32.2 and rs73208120 on 12q24.22 (Figure 1). With respect to broader generalization of risk regions (index ± 500kb), the SNPs most statistically significantly associated with risk of CRC in HL are summarized in Supplementary Table 4, available at Carcinogenesis Online. In each of three regions where the top marker was not the index (12q24.22, 16q22.1 and 18q21.1), the lead variant was correlated with the index at r² ≥ 0.2 in the original GWAS population, suggesting that a different variant (or variant set) in HL as compared to other populations may better tag the same underlying functional element.
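
The two replication criteria (consistent direction of effect and nominal significance) can be expressed compactly. The sketch below is illustrative only; the function names are hypothetical, and it assumes that both odds ratios are expressed with respect to the same allele, namely the previously published risk allele.

```python
import math


def same_direction(or_published, or_study):
    """True if both odds ratios lie on the same side of 1.

    Assumes both ORs refer to the same (published risk) allele.
    """
    return math.log(or_published) * math.log(or_study) > 0


def replicates(or_published, or_study, p_study, alpha=0.05):
    """Apply both replication criteria: consistent direction and P < alpha."""
    return same_direction(or_published, or_study) and p_study < alpha


def fraction_replicated(results):
    """results: iterable of (or_published, or_study, p_study) tuples."""
    results = list(results)
    return sum(replicates(*r) for r in results) / len(results)
```

With this definition, 19 replicating alleles out of 57 gives 19/57 ≈ 33.3%, matching the proportion reported above.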

Figure 1.

Comparison of association effect sizes for previously published CRC risk SNPs (n = 57) in the original GWAS population and in Hispanic/Latinos. Red shading denotes P < 0.05 in the HL study. GWAS = genome-wide association study. OR = odds ratio. Pearson correlation coefficient = 0.13.

Identification of secondary susceptibility alleles

Using our fine-mapping strategy outlined in the statistical methods, we identified two risk regions on chromosomes 3q26.2 and 11q12.2 in which the most strongly associated variant was weakly correlated with its corresponding index SNP in the original discovery population (r² < 0.2) and in which the association P value was smaller than our pre-specified region-specific threshold (Table 2). The unconditional and conditional analysis results for the proposed independent SNPs are summarized in Table 2. At 3q26.2, the most statistically significantly associated SNP from the unconditional analysis was rs116626941 [OR = 1.35 (95% CI: 1.17–1.56); P = 4.0×10^−5; Figure 2A]. However, after conditioning on the region’s most strongly associated index-correlated variant, rs56012908, rs116626941 was no longer associated with a P value less than the pre-specified region-specific threshold of 1.3×10^−4 (Figure 2D). Nonetheless, a second SNP, rs185423955, was associated with risk of CRC below our region-specific P value threshold in both unconditional and conditional analyses (Figure 2D). At 11q12.2, we identified rs60892987 as a novel secondary signal (OR = 1.32, P = 6.6×10^−5) that met our region-specific P value threshold after conditioning on rs28456, the region’s most strongly associated variant in high LD with the index SNP rs1535 (r² (ASN) = 0.93) (Figure 2B and E).

Table 2.

Known colorectal cancer susceptibility regions harboring a variant that is statistically significantly associated with CRC with P < region-specific threshold in HL (N = 5941)

Region	rsID	Chr	Position (hg19)	Eff	Alt	Frq Eff	Info	Unconditional					Conditional
Region	rsID	Chr	Position (hg19)	Eff	Alt	Frq Eff	Info	OR	SE	95% LCL	95% UCL	P	OR	SE	95% LCL	95% UCL	P
3q26.2	rs185423955	3	1699950156	C	T	0.04	0.96	1.61	0.11	1.29	2.01	3.2E-05	1.57	0.11	1.26	1.97	7.5E−05
11q12.2	rs60892987^a	11	61982418	A	G	0.10	1.00	1.34	0.07	1.17	1.53	2.7E-05	1.32	0.07	1.15	1.51	6.6E−05
Borderline significant variants
2q32.3	rs7604359^a	2	192335294	C	A	0.05	1.00	1.43	0.09	1.19	1.71	1.2E-04	1.40	0.09	1.17	1.68	2.6E−04

Results are derived from unconditional and conditional logistic regression adjusted for age, sex and 10 PCs for global ancestry. The index or lead variant adjusted for in conditional analyses had r² ≥ 0.2 with the index in the original discovery population.

aDirectly genotyped.

Figure 2.

LocusZoom regional plots (±500kb from the index SNP and/or the region’s most strongly associated variant in LD (r² ≥ 0.2) with the index) for 2q32.3, 3q26.2 and 11q12.2 based on analyses using best genotype calls. A–C represent association results from logistic regression adjusted for age, sex and global ancestry. D–F represent association results from logistic regression adjusted for age, sex, global ancestry and allelic dosage of the known region’s lead variant. Linkage disequilibrium shading is based on 1000 Genomes Project Phase 3 AMR samples. Diamond-shaped points in purple represent these regions’ lead variants.

We identified a third region at 2q32.3 with suggestive evidence of a secondary signal, but where the most strongly associated marker did not meet the P-value threshold for that region upon conditional analysis (P = 1.1×10^−4, Table 2). Rs7604359 was in low LD with the index SNP (rs11903757) in the original GWAS population (r² (EUR) = 0.002). This SNP was statistically significantly associated with risk of CRC in our unconditional analysis (P = 1.2×10^−4) but was borderline significant with respect to our pre-specified threshold in an analysis conditional on the region’s lead index-correlated variant, rs12474044 (P = 2.6×10^−4). Rs7604359, approximately 250kb away from the index and downstream of the myosin IB (MYO1B) gene, represents a candidate that tags a potential independent signal in this known region (Figure 2C and F).
Notably, a sensitivity analysis that excluded CRC cases with diabetes showed no appreciable differences in results for rs185423955, rs60892987 and rs7604359 (data not shown).

Discussion

Evaluating the genetic susceptibility to CRC in diverse racial/ethnic groups is important for understanding the generalizability of previous findings, localizing functional variants that underlie known risk regions, and identifying population-specific variation. Our study represents the first large-scale GWAS of CRC in the HL population, an ethnic group experiencing increasing incidence of early-onset and late-stage disease (5–9). Although we identified novel susceptibility regions with only suggestive levels of statistical evidence, our study characterized the generalizability of previous findings from studies mainly in non-Hispanic whites and East Asians to HL. Importantly, our fine-mapping analyses identified two known risk loci (rs185423955/3q26.2 and rs60892987/11q12.2) with a novel secondary association signal within 500kb of the index SNP(s), which may help guide future risk modeling in HL. Germline genetic studies of racial/ethnic minorities present a unique opportunity to better understand factors contributing to disparities in CRC incidence and to narrow down the list of candidate variants in known risk regions for functional follow-up and risk modeling. Here, we sought to characterize novel genome-wide genetic variation as well as to better understand the association of known susceptibility SNPs in relation to the risk of developing CRC in HL. Although our study was not powered to identify low-penetrance risk variants at the conventional genome-wide significance level of 5×10^−8, we did identify 17 variants across 4 regions with suggestive CRC associations (P < 1×10^−6). The variants with the most statistically significant associations with risk of CRC in all four of these regions (rs7528276, rs1367374, rs143046984 and rs142319636) warrant special attention during replication efforts, as examples of HL-specific variation have been demonstrated in relation to risk of CRC, cancers at other organ sites and other complex diseases.
For example, two concurrent studies of CRC, with discovery conducted in Japanese and African-American subjects and in East Asian subjects, respectively, identified a novel genome-wide significant risk locus at 10q25.2 (28,36). Other illustrative examples of disease-associated variants that are common in HL but rare in other populations include a breast cancer susceptibility SNP at 6q25 (5′ of the Estrogen Receptor 1 gene) (34) and a type 2 diabetes risk haplotype spanning SLC16A11 from the SIGMA study of genetic risk factors for type 2 diabetes (35). Of particular interest in the present study is the suggestive risk region on chromosome 1p34.3, which lies in an intron of the microtubule-actin crosslinking factor 1 (MACF1) gene. MACF1 (formerly ACF7) knockdown experiments in a mouse embryonic carcinoma cell line resulted in the inhibition of Wnt signaling, a critical signal transduction pathway in the colon, while knockdown in colonic mucosa led to altered mucosal epithelial arrangement and proliferation due to changes in cytoskeleton dynamics (49,50). Interestingly, MACF1 is found in complexes with APC and is regulated by GSK3 in skin and breast carcinoma cells (51,52). Next, our study investigated the ability to confirm known susceptibility alleles, and more broadly, characterized variants in the surrounding regions in HL. We replicated approximately 33% of the 57 previously identified CRC risk SNPs available for analysis (MAF ≥ 1%) with P < 0.05. This is comparable to 31% of the first 29 risk SNPs replicated in a similarly sized study of African-Americans (31). Further, 29 risk regions (±500kb surrounding the index) included a SNP in at least moderate LD with the index (r² ≥ 0.2 in the original discovery ethnic group) that was associated with risk of CRC at P < 0.05.
Potential explanations for the lack of replication of some susceptibility alleles in HL include: (i) limited power due to modest sample size and differences in allele frequencies across populations; (ii) differential tagging of the underlying functional/causal variant across racial/ethnic groups with different LD structures; and (iii) true biologic heterogeneity, potentially driven by differences in the distribution of important environmental and lifestyle factors across populations.

An additional goal of this study was to fine-map known risk regions in an effort to identify novel secondary signals potentially specific to the HL population. Our analysis revealed two known risk regions (3q26.2 and 11q12.2) with at least one variant weakly correlated with the index that exceeded a region-specific threshold for statistical significance. For each region, this study identified a putative secondary signal following analysis conditional on the index or a more strongly associated proxy (r² ≥ 0.2 with the index). For the region on 2q32.3, a borderline secondary signal was identified. Further, we observed that known risk loci at 8q24.21 and 18q21.1 harbored SNPs exceeding the respective region’s P value threshold.

The risk locus at 3q26.2 was among the earliest CRC susceptibility regions identified using a GWAS meta-analysis approach. The index SNP, rs10936599, lies in a coding region of the myoneurin gene (MYNN), which encodes a zinc finger protein with largely unknown function (53). However, it also lies upstream of the telomerase RNA component (TERC) locus, and the SNP has been associated with longer telomere length (54,55). It has also been associated with increased risk of other cancers including multiple myeloma, bladder cancer, and chronic lymphocytic leukemia (56–58).
Our study did not replicate the index SNP but found an additional variant within 100kb, rs185423955, that was statistically significantly associated with risk of CRC in HL in both unconditional and conditional analyses. This SNP, which leads a putative secondary association signal, is an intronic variant in the protein kinase C, iota form (PRKCI) gene, which encodes a protein implicated in Ras-mediated transformation of colon tissues and CRC progression (59–62). Our findings provide additional evidence in support of this susceptibility locus and highlight the complicated nature of this gene-rich region in relation to risk of CRC.

At 11q12.2, four highly correlated risk SNPs (forming two common haplotypes) have been identified through prior work in East Asians (28). These SNPs were hypothesized to affect the development of CRC as expression quantitative trait loci for FEN1, FADS1 and/or FADS2 (28,63). Our examination of the surrounding ~1 megabase (Mb) region identified, through conditional analysis, a secondary signal that is tagged by the intergenic SNP rs60892987. This SNP lies about 384kb upstream of the most strongly associated index SNP (rs1535) and about 393kb from the region’s lead variant correlated with that index (rs28456). The SNPs rs60892987 and rs28456 are in low LD with each other (r² = 0.04 in AMR; r² = 0.01 in ASN), suggesting that these signals are independent. However, it is possible that both SNPs are in moderate LD with one or more shared functional variants yet to be discovered. This region was recently identified and has not yet been a major focus of fine-mapping or functional characterization efforts aside from quantitative trait loci analysis. The potential biological significance of this gene-rich region remains largely unknown, and our identification of a novel secondary association signal supports the complexity of this locus.
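The r² values quoted above measure pairwise LD between two biallelic SNPs. As a minimal sketch of how such a value is obtained from phased data, the standard definition is r² = D² / [p_A(1 − p_A) p_B(1 − p_B)], where D = p_AB − p_A p_B. The function name `ld_r2` and the frequencies in the example are hypothetical and chosen only for illustration; they are not the haplotype frequencies underlying the AMR or ASN estimates reported here.

```python
def ld_r2(hap_freq_ab: float, freq_a: float, freq_b: float) -> float:
    """Squared correlation (r^2) between two biallelic SNPs.

    hap_freq_ab: frequency of the haplotype carrying allele A at SNP 1
                 and allele B at SNP 2
    freq_a:      frequency of allele A at SNP 1
    freq_b:      frequency of allele B at SNP 2
    """
    denom = freq_a * (1 - freq_a) * freq_b * (1 - freq_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = hap_freq_ab - freq_a * freq_b  # LD coefficient D
    return d * d / denom


# Hypothetical frequencies, for illustration only.
weak = ld_r2(hap_freq_ab=0.09, freq_a=0.30, freq_b=0.20)
perfect = ld_r2(hap_freq_ab=0.30, freq_a=0.30, freq_b=0.30)

print(f"weakly correlated pair:  r^2 = {weak:.3f}")
print(f"perfectly correlated pair: r^2 = {perfect:.3f}")
```

A pair with r² well below 0.2, like the first hypothetical example, would fall under the threshold used in this study to define at least moderate LD with an index SNP.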
An intergenic SNP between the nucleic acid binding protein 1 (NABP1) and serum deprivation response phosphatidylserine-binding protein (SDPR) genes on 2q32.3, rs11903757, was first identified as a CRC susceptibility allele in a multiethnic sample of Europeans (discovery and replication) and East Asians (replication) (12). Subsequently, this SNP was replicated in one recent meta-analysis of Europeans and East Asians but not in a second (20,32). Although the most proximal gene is not necessarily the most functionally relevant, it is worth noting that a potential link between the single-stranded DNA binding protein NABP1 and cancer development has been suggested through its role, as a component of the sensor of ssDNA (SOSS) complex, in the maintenance of genomic stability; NABP1 was also found in a proteomic screen of differentially expressed proteins in CRCs (64,65). Here, we did not replicate this index SNP, but we did observe that rs12474044, a SNP approximately 10kb away that is in weak LD with the index (r² = 0.12 in AMR; r² = 0.16 in EUR), was associated with risk of CRC at a nominal level of statistical significance. Further, we identified a suggestive novel secondary signal led by rs7604359, centromeric to the previously published tag SNP. This marker is in low LD with the index SNP (r² = 0.000 in AMR; r² = 0.002 in EUR), which is not surprising given that it is adjacent to a recombination hotspot. This SNP lies downstream of MYO1B, a gene that encodes myosin IB, a molecular motor protein. In general, this risk region has not been a focus of prior fine-mapping or functional characterization efforts, so additional evidence is needed to replicate this potential independent signal in other populations.

Finally, we examined in detail the well-characterized CRC susceptibility regions at 8q24.21 and 18q21.1, both of which had SNPs that reached our region-specific thresholds despite being in high LD with their index variants.
The low-penetrance risk locus at 8q24.21 was the first to be identified in association with CRC, and three index SNPs in the region have been described previously (16–19,27,66–68). In this region, our investigation suggested that the index SNP, rs6983267, is also the best marker of CRC risk in HL.

The CRC risk region at 18q21.1 was the second genome-wide significant locus to be identified in European ancestry populations (23). The index SNP, rs4939827, lies within intron 4 of SMAD7, a gene that encodes a critical regulator of TGF-β signaling. A study in East Asians recently identified another genome-wide significant SNP, rs7229639, which is uncorrelated with rs4939827 in Asians (r² = 0.02 in ASN) and weakly correlated in Europeans (r² = 0.10 in EUR) (29). We replicated this finding at a nominal level of statistical significance for rs7229639 [OR (95% CI) = 1.12 (1.01, 1.25); P = 3.9×10⁻²] but found that the variant in this region most strongly associated with CRC risk in our HL study is rs11874392 [OR (95% CI) = 1.27 (1.17, 1.39); P = 6.9×10⁻⁸]. This region has been a recent focus of in-depth functional exploration, and it has been proposed that the increased CRC risk is driven by four variants (rs6507874, rs6507875, rs8085824 and rs58920878) that have allele-specific enhancer effects on SMAD7 expression (69). In our study, all four variants replicated at P < 0.005 (data not shown).

These results should be interpreted in the context of the study’s limitations. Primarily, this investigation, with its modest sample size, was underpowered for detecting novel low-penetrance susceptibility alleles (MAF < 5%) at the standard genome-wide significance level. However, we did identify potential novel regions and secondary signals in known regions that merit follow-up in future studies.
Thus, the lack of novel risk regions at the P < 5×10⁻⁸ level should not be interpreted as definitive evidence regarding the absence of population-specific variation influencing CRC risk among HL. With regard to fine-mapping, reliance on imperfectly imputed genotypes can be limiting. However, imputation based on a multiethnic reference panel performed exceptionally well for common variation in this population, as evidenced by high info scores in Supplementary Table 2, available at Carcinogenesis Online. Further, conditional analysis is not necessarily the most robust approach for identifying independent signals in a region, and the SNPs with the smallest P values for association are often not the most likely functional candidates. Finally, there are few CRC studies in HL populations with high-throughput genotype data available; therefore, replication of our fine-mapping findings is difficult at this time.

In summary, this study demonstrates the utility of conducting genetic studies in racial/ethnic minorities to better understand the complicated genetic architecture of known risk regions. Future work is needed to replicate and evaluate the biological significance of the identified top variants in known regions and secondary signals, to conduct admixture mapping and to examine the ability of these newly identified SNPs to improve HL-specific risk prediction modeling.
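The power limitation discussed above can be illustrated with a rough calculation. The sketch below approximates the power of a two-sided Wald test on the allelic log odds ratio, using the sample sizes given in the abstract (1611 cases and 4330 controls). It is a simplified approximation under stated assumptions, not the power analysis or association model used in this study: it treats alleles as independent observations, ignores covariates and ancestry adjustment, and uses an odds ratio of 1.2 and a control allele frequency of 0.30 as hypothetical inputs. The function name `allelic_power` is likewise hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def allelic_power(n_cases: int, n_controls: int, control_freq: float,
                  odds_ratio: float, alpha: float) -> float:
    """Approximate two-sided power of a Wald test on the allelic log OR."""
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    case_freq = odds_ratio * control_freq / (
        1 - control_freq + odds_ratio * control_freq
    )
    # Woolf variance of the log OR from expected allele counts
    # (two alleles per individual).
    variance = (
        1 / (2 * n_cases * case_freq)
        + 1 / (2 * n_cases * (1 - case_freq))
        + 1 / (2 * n_controls * control_freq)
        + 1 / (2 * n_controls * (1 - control_freq))
    )
    std_normal = NormalDist()
    z_effect = abs(log(odds_ratio)) / sqrt(variance)
    z_crit = -std_normal.inv_cdf(alpha / 2)
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Sample sizes from the abstract; OR and allele frequency are assumptions.
power_gw = allelic_power(1611, 4330, control_freq=0.30, odds_ratio=1.2, alpha=5e-8)
power_nominal = allelic_power(1611, 4330, control_freq=0.30, odds_ratio=1.2, alpha=0.05)

print(f"Power at P < 5e-8: {power_gw:.2f}")
print(f"Power at P < 0.05: {power_nominal:.2f}")
```

Under these assumptions, a variant of modest effect is very likely to reach nominal significance but unlikely to reach the genome-wide threshold, which is consistent with interpreting the suggestive associations as candidates for replication rather than as negative results.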

Supplementary material

Supplementary Tables 1–4 and Figures 1 and 2 can be found at http://carcin.oxfordjournals.org/

Funding

This work was supported by the National Institutes of Health [R01CA155101 to J.C.F., U01HG004726 to C.A.H., R01CA140561 to D.V.C. and F.R.S., T32ES013678 to S.L.S., U19CA148107 and P30CA014089].

75 in total

1. Principal components analysis corrects for stratification in genome-wide association studies.

Authors: Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich
Journal: Nat Genet Date: 2006-07-23 Impact factor: 38.330

2. Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico.

Authors: Amy L Williams; Suzanne B R Jacobs; Hortensia Moreno-Macías; Alicia Huerta-Chagoya; Claire Churchhouse; Carla Márquez-Luna; Humberto García-Ortíz; María José Gómez-Vázquez; Noël P Burtt; Carlos A Aguilar-Salinas; Clicerio González-Villalpando; Jose C Florez; Lorena Orozco; Christopher A Haiman; Teresa Tusié-Luna; David Altshuler
Journal: Nature Date: 2013-12-25 Impact factor: 49.962

3. Incidence and mortality rates for colorectal cancer in Puerto Rico and among Hispanics, non-Hispanic whites, and non-Hispanic blacks in the United States, 1998-2002.

Authors: Marievelisse Soto-Salgado; Erick Suárez; William Calo; Marcia Cruz-Correa; Nayda R Figueroa-Vallés; Ana P Ortiz
Journal: Cancer Date: 2009-07-01 Impact factor: 6.860

4. A common genetic risk factor for colorectal and prostate cancer.

Authors: Christopher A Haiman; Loïc Le Marchand; Jennifer Yamamato; Daniel O Stram; Xin Sheng; Laurence N Kolonel; Anna H Wu; David Reich; Brian E Henderson
Journal: Nat Genet Date: 2007-07-08 Impact factor: 38.330

5. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012.

Authors: Jacques Ferlay; Isabelle Soerjomataram; Rajesh Dikshit; Sultan Eser; Colin Mathers; Marise Rebelo; Donald Maxwell Parkin; David Forman; Freddie Bray
Journal: Int J Cancer Date: 2014-10-09 Impact factor: 7.396

6. Common genetic variants at the CRAC1 (HMPS) locus on chromosome 15q13.3 influence colorectal cancer risk.

Authors: Emma Jaeger; Emily Webb; Kimberley Howarth; Luis Carvajal-Carmona; Andrew Rowan; Peter Broderick; Axel Walther; Sarah Spain; Alan Pittman; Zoe Kemp; Kate Sullivan; Karl Heinimann; Steven Lubbe; Enric Domingo; Ella Barclay; Lynn Martin; Maggie Gorman; Ian Chandler; Jayaram Vijayakrishnan; Wendy Wood; Elli Papaemmanuil; Steven Penegar; Mobshra Qureshi; Susan Farrington; Albert Tenesa; Jean-Baptiste Cazier; David Kerr; Richard Gray; Julian Peto; Malcolm Dunlop; Harry Campbell; Huw Thomas; Richard Houlston; Ian Tomlinson
Journal: Nat Genet Date: 2007-12-16 Impact factor: 38.330

7. Multiple common susceptibility variants near BMP pathway loci GREM1, BMP4, and BMP2 explain part of the missing heritability of colorectal cancer.

Authors: Ian P M Tomlinson; Luis G Carvajal-Carmona; Sara E Dobbins; Albert Tenesa; Angela M Jones; Kimberley Howarth; Claire Palles; Peter Broderick; Emma E M Jaeger; Susan Farrington; Annabelle Lewis; James G D Prendergast; Alan M Pittman; Evropi Theodoratou; Bianca Olver; Marion Walker; Steven Penegar; Ella Barclay; Nicola Whiffin; Lynn Martin; Stephane Ballereau; Amy Lloyd; Maggie Gorman; Steven Lubbe; Bryan Howie; Jonathan Marchini; Clara Ruiz-Ponte; Ceres Fernandez-Rozadilla; Antoni Castells; Angel Carracedo; Sergi Castellvi-Bel; David Duggan; David Conti; Jean-Baptiste Cazier; Harry Campbell; Oliver Sieber; Lara Lipton; Peter Gibbs; Nicholas G Martin; Grant W Montgomery; Joanne Young; Paul N Baird; Steven Gallinger; Polly Newcomb; John Hopper; Mark A Jenkins; Lauri A Aaltonen; David J Kerr; Jeremy Cheadle; Paul Pharoah; Graham Casey; Richard S Houlston; Malcolm G Dunlop
Journal: PLoS Genet Date: 2011-06-02 Impact factor: 5.917

8. A novel colorectal cancer risk locus at 4q32.2 identified from an international genome-wide association study.

Authors: Stephanie L Schmit; Fredrick R Schumacher; Christopher K Edlund; David V Conti; Leon Raskin; Flavio Lejbkowicz; Mila Pinchev; Hedy S Rennert; Mark A Jenkins; John L Hopper; Daniel D Buchanan; Noralane M Lindor; Loic Le Marchand; Steven Gallinger; Robert W Haile; Polly A Newcomb; Shu-Chen Huang; Gad Rennert; Graham Casey; Stephen B Gruber
Journal: Carcinogenesis Date: 2014-07-14 Impact factor: 4.741

9. Common variation at 3q26.2, 6p21.33, 17p11.2 and 22q13.1 influences multiple myeloma risk.

Authors: Daniel Chubb; Niels Weinhold; Peter Broderick; Bowang Chen; David C Johnson; Asta Försti; Jayaram Vijayakrishnan; Gabriele Migliorini; Sara E Dobbins; Amy Holroyd; Dirk Hose; Brian A Walker; Faith E Davies; Walter A Gregory; Graham H Jackson; Julie A Irving; Guy Pratt; Chris Fegan; James Al Fenton; Kai Neben; Per Hoffmann; Markus M Nöthen; Thomas W Mühleisen; Lewin Eisele; Fiona M Ross; Christian Straka; Hermann Einsele; Christian Langer; Elisabeth Dörner; James M Allan; Anna Jauch; Gareth J Morgan; Kari Hemminki; Richard S Houlston; Hartmut Goldschmidt
Journal: Nat Genet Date: 2013-08-18 Impact factor: 38.330

10. Protein kinase Ciota is required for Ras transformation and colon carcinogenesis in vivo.

Authors: Nicole R Murray; Lee Jamieson; Wangsheng Yu; Jie Zhang; Yesim Gökmen-Polar; Deborah Sier; Panos Anastasiadis; Zoran Gatalica; E Aubrey Thompson; Alan P Fields
Journal: J Cell Biol Date: 2004-03-15 Impact factor: 10.539

16 in total

Review 1. Genome-Wide Association Studies of Cancer in Diverse Populations.

Authors: Sungshim L Park; Iona Cheng; Christopher A Haiman
Journal: Cancer Epidemiol Biomarkers Prev Date: 2017-06-21 Impact factor: 4.254

2. Familial Risk and Heritability of Colorectal Cancer in the Nordic Twin Study of Cancer.

Authors: Rebecca E Graff; Sören Möller; Michael N Passarelli; John S Witte; Axel Skytthe; Kaare Christensen; Qihua Tan; Hans-Olov Adami; Kamila Czene; Jennifer R Harris; Eero Pukkala; Jaakko Kaprio; Edward L Giovannucci; Lorelei A Mucci; Jacob B Hjelmborg
Journal: Clin Gastroenterol Hepatol Date: 2017-01-24 Impact factor: 11.382

3. Novel colon cancer susceptibility variants identified from a genome-wide association study in African Americans.

Authors: Hansong Wang; Stephanie L Schmit; Christopher A Haiman; Temitope O Keku; Ikuko Kato; Julie R Palmer; David van den Berg; Lynne R Wilkens; Terrilea Burnett; David V Conti; Fredrick R Schumacher; Lisa B Signorello; William J Blot; Krista A Zanetti; Curtis Harris; Mala Pande; Sonja I Berndt; Polly A Newcomb; Dee W West; Robert Haile; Daniel O Stram; Jane C Figueiredo; Loïc Le Marchand
Journal: Int J Cancer Date: 2017-03-28 Impact factor: 7.396

Review 4. Cancer Epidemiology in Hispanic Populations: What Have We Learned and Where Do We Need to Make Progress?

Authors: Laura Fejerman; Amelie G Ramirez; Anna María Nápoles; Scarlett Lin Gomez; Mariana C Stern
Journal: Cancer Epidemiol Biomarkers Prev Date: 2022-05-04 Impact factor: 4.090

5. Novel Common Genetic Susceptibility Loci for Colorectal Cancer.

Authors: Stephanie L Schmit; Christopher K Edlund; Fredrick R Schumacher; Jian Gong; Tabitha A Harrison; Jeroen R Huyghe; Chenxu Qu; Marilena Melas; David J Van Den Berg; Hansong Wang; Stephanie Tring; Sarah J Plummer; Demetrius Albanes; M Henar Alonso; Christopher I Amos; Kristen Anton; Aaron K Aragaki; Volker Arndt; Elizabeth L Barry; Sonja I Berndt; Stéphane Bezieau; Stephanie Bien; Amanda Bloomer; Juergen Boehm; Marie-Christine Boutron-Ruault; Hermann Brenner; Stefanie Brezina; Daniel D Buchanan; Katja Butterbach; Bette J Caan; Peter T Campbell; Christopher S Carlson; Jose E Castelao; Andrew T Chan; Jenny Chang-Claude; Stephen J Chanock; Iona Cheng; Ya-Wen Cheng; Lee Soo Chin; James M Church; Timothy Church; Gerhard A Coetzee; Michelle Cotterchio; Marcia Cruz Correa; Keith R Curtis; David Duggan; Douglas F Easton; Dallas English; Edith J M Feskens; Rocky Fischer; Liesel M FitzGerald; Barbara K Fortini; Lars G Fritsche; Charles S Fuchs; Manuela Gago-Dominguez; Manish Gala; Steven J Gallinger; W James Gauderman; Graham G Giles; Edward L Giovannucci; Stephanie M Gogarten; Clicerio Gonzalez-Villalpando; Elena M Gonzalez-Villalpando; William M Grady; Joel K Greenson; Andrea Gsur; Marc Gunter; Christopher A Haiman; Jochen Hampe; Sophia Harlid; John F Harju; Richard B Hayes; Philipp Hofer; Michael Hoffmeister; John L Hopper; Shu-Chen Huang; Jose Maria Huerta; Thomas J Hudson; David J Hunter; Gregory E Idos; Motoki Iwasaki; Rebecca D Jackson; Eric J Jacobs; Sun Ha Jee; Mark A Jenkins; Wei-Hua Jia; Shuo Jiao; Amit D Joshi; Laurence N Kolonel; Suminori Kono; Charles Kooperberg; Vittorio Krogh; Tilman Kuehn; Sébastien Küry; Andrea LaCroix; Cecelia A Laurie; Flavio Lejbkowicz; Mathieu Lemire; Heinz-Josef Lenz; David Levine; Christopher I Li; Li Li; Wolfgang Lieb; Yi Lin; Noralane M Lindor; Yun-Ru Liu; Fotios Loupakis; Yingchang Lu; Frank Luh; Jing Ma; Christoph Mancao; Frank J Manion; Sanford D Markowitz; Vicente Martin; Koichi Matsuda; Keitaro Matsuo; Kevin J McDonnell; 
Caroline E McNeil; Roger Milne; Antonio J Molina; Bhramar Mukherjee; Neil Murphy; Polly A Newcomb; Kenneth Offit; Hanane Omichessan; Domenico Palli; Jesus P Paredes Cotoré; Julyann Pérez-Mayoral; Paul D Pharoah; John D Potter; Conghui Qu; Leon Raskin; Gad Rennert; Hedy S Rennert; Bridget M Riggs; Clemens Schafmayer; Robert E Schoen; Thomas A Sellers; Daniela Seminara; Gianluca Severi; Wei Shi; David Shibata; Xiao-Ou Shu; Erin M Siegel; Martha L Slattery; Melissa Southey; Zsofia K Stadler; Mariana C Stern; Sebastian Stintzing; Darin Taverna; Stephen N Thibodeau; Duncan C Thomas; Antonia Trichopoulou; Shoichiro Tsugane; Cornelia M Ulrich; Franzel J B van Duijnhoven; Bethany van Guelpan; Joseph Vijai; Jarmo Virtamo; Stephanie J Weinstein; Emily White; Aung Ko Win; Alicja Wolk; Michael Woods; Anna H Wu; Kana Wu; Yong-Bing Xiang; Yun Yen; Brent W Zanke; Yi-Xin Zeng; Ben Zhang; Niha Zubair; Sun-Seog Kweon; Jane C Figueiredo; Wei Zheng; Loic Le Marchand; Annika Lindblom; Victor Moreno; Ulrike Peters; Graham Casey; Li Hsu; David V Conti; Stephen B Gruber
Journal: J Natl Cancer Inst Date: 2019-02-01 Impact factor: 13.506

6. Genomic mechanisms of fatigue in survivors of colorectal cancer.

Authors: David S Black; Steve W Cole; Georgia Christodoulou; Jane C Figueiredo
Journal: Cancer Date: 2018-03-26 Impact factor: 6.921

7. Trends in colorectal cancer mortality in hispanics: a SEER analysis.

Authors: Afsaneh Barzi; Dongyun Yang; Sayedamin Mostofizadeh; Heinz-Josef Lenz
Journal: Oncotarget Date: 2017-10-19

8. Role of glucose metabolism related gene GLUT1 in the occurrence and prognosis of colorectal cancer.

Authors: Wenming Feng; Ge Cui; Cheng-Wu Tang; Xiao-Lan Zhang; Chuang Dai; Yong-Qiang Xu; Hui Gong; Tao Xue; Hui-Hui Guo; Ying Bao
Journal: Oncotarget Date: 2017-05-23

9. Stepwise approach to SNP-set analysis illustrated with the Metabochip and colorectal cancer in Japanese Americans of the Multiethnic Cohort.

Authors: John Cologne; Lenora Loo; Yurii B Shvetsov; Munechika Misumi; Philip Lin; Christopher A Haiman; Lynne R Wilkens; Loïc Le Marchand
Journal: BMC Genomics Date: 2018-07-09 Impact factor: 3.969

10. Association of genetic ancestry with colorectal tumor location in Puerto Rican Latinos.

Authors: Julyann Pérez-Mayoral; Marievelisse Soto-Salgado; Ebony Shah; Rick Kittles; Mariana C Stern; Myrta I Olivera; María Gonzalez-Pons; Segundo Rodriguez-Quilichinni; Marla Torres; Jose S Reyes; Luis Tous; Nicolas López; Victor Carlo Chevere; Marcia Cruz-Correa
Journal: Hum Genomics Date: 2019-02-20 Impact factor: 4.639