
Genome-wide association study identifies multiple risk loci for chronic lymphocytic leukemia.

Sonja I Berndt, Christine F Skibola, Vijai Joseph, Nicola J Camp, Alexandra Nieters, Zhaoming Wang, Wendy Cozen, Alain Monnereau, Sophia S Wang, Rachel S Kelly, Qing Lan, Lauren R Teras, Nilanjan Chatterjee, Charles C Chung, Meredith Yeager, Angela R Brooks-Wilson, Patricia Hartge, Mark P Purdue, Brenda M Birmann, Bruce K Armstrong, Pierluigi Cocco, Yawei Zhang, Gianluca Severi, Anne Zeleniuch-Jacquotte, Charles Lawrence, Laurie Burdette, Jeffrey Yuenger, Amy Hutchinson, Kevin B Jacobs, Timothy G Call, Tait D Shanafelt, Anne J Novak, Neil E Kay, Mark Liebow, Alice H Wang, Karin E Smedby, Hans-Olov Adami, Mads Melbye, Bengt Glimelius, Ellen T Chang, Martha Glenn, Karen Curtin, Lisa A Cannon-Albright, Brandt Jones, W Ryan Diver, Brian K Link, George J Weiner, Lucia Conde, Paige M Bracci, Jacques Riby, Elizabeth A Holly, Martyn T Smith, Rebecca D Jackson, Lesley F Tinker, Yolanda Benavente, Nikolaus Becker, Paolo Boffetta, Paul Brennan, Lenka Foretova, Marc Maynadie, James McKay, Anthony Staines, Kari G Rabe, Sara J Achenbach, Celine M Vachon, Lynn R Goldin, Sara S Strom, Mark C Lanasa, Logan G Spector, Jose F Leis, Julie M Cunningham, J Brice Weinberg, Vicki A Morrison, Neil E Caporaso, Aaron D Norman, Martha S Linet, Anneclaire J De Roos, Lindsay M Morton, Richard K Severson, Elio Riboli, Paolo Vineis, Rudolph Kaaks, Dimitrios Trichopoulos, Giovanna Masala, Elisabete Weiderpass, María-Dolores Chirlaque, Roel C H Vermeulen, Ruth C Travis, Graham G Giles, Demetrius Albanes, Jarmo Virtamo, Stephanie Weinstein, Jacqueline Clavel, Tongzhang Zheng, Theodore R Holford, Kenneth Offit, Andrew Zelenetz, Robert J Klein, John J Spinelli, Kimberly A Bertrand, Francine Laden, Edward Giovannucci, Peter Kraft, Anne Kricker, Jenny Turner, Claire M Vajdic, Maria Grazia Ennas, Giovanni M Ferri, Lucia Miligi, Liming Liang, Joshua Sampson, Simon Crouch, Ju-Hyun Park, Kari E North, Angela Cox, John A Snowden, Josh Wright, Angel Carracedo, Carlos Lopez-Otin, Silvia Bea, Itziar Salaverria, David Martin-Garcia, Elias Campo, Joseph F Fraumeni, Silvia de Sanjose, Henrik Hjalgrim, James R Cerhan, Stephen J Chanock, Nathaniel Rothman, Susan L Slager.

Abstract

Genome-wide association studies (GWAS) have previously identified 13 loci associated with risk of chronic lymphocytic leukemia or small lymphocytic lymphoma (CLL). To identify additional CLL susceptibility loci, we conducted the largest meta-analysis for CLL thus far, including four GWAS with a total of 3,100 individuals with CLL (cases) and 7,667 controls. In the meta-analysis, we identified ten independent associated SNPs in nine new loci at 10q23.31 (ACTA2 or FAS (ACTA2/FAS), P=1.22×10(-14)), 18q21.33 (BCL2, P=7.76×10(-11)), 11p15.5 (C11orf21, P=2.15×10(-10)), 4q25 (LEF1, P=4.24×10(-10)), 2q33.1 (CASP10 or CASP8 (CASP10/CASP8), P=2.50×10(-9)), 9p21.3 (CDKN2B-AS1, P=1.27×10(-8)), 18q21.32 (PMAIP1, P=2.51×10(-8)), 15q15.1 (BMF, P=2.71×10(-10)) and 2p22.2 (QPCT, P=1.68×10(-8)), as well as an independent signal at an established locus (2q13, ACOXL, P=2.08×10(-18)). We also found evidence for two additional promising loci below genome-wide significance at 8q22.3 (ODF1, P=5.40×10(-8)) and 5p15.33 (TERT, P=1.92×10(-7)). Although further studies are required, the proximity of several of these loci to genes involved in apoptosis suggests a plausible underlying biological mechanism.

Year:  2013        PMID: 23770605      PMCID: PMC3729927          DOI: 10.1038/ng.2652

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   41.307


Despite limited discovery stages (<1,125 cases), genome-wide association studies (GWAS) have successfully identified 13 loci associated with risk of chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL). To identify additional CLL susceptibility loci, we conducted the largest meta-analysis, to date, including four GWAS totaling 3,100 CLL cases and 7,667 controls with genotype data. In the meta-analysis, we discovered ten independent SNPs in nine novel loci at 10q23.31 (ACTA2/FAS; P=1.22×10−14), 18q21.33 (BCL2; P=7.76×10−11), 11p15.5 (C11orf21; P=2.15×10−10), 4q25 (LEF1; P=4.24×10−10), 2q33.1 (CASP10/CASP8; P=2.50×10−9), 9p21.3 (CDKN2B-AS1; P=1.27×10−8), 18q21.32 (PMAIP1; P=2.51×10−8), 15q15.1 (BMF; P=2.71×10−10), and 2p22.2 (QPCT; P=1.68×10−8) as well as an independent signal at an established locus (2q13, ACOXL, P=2.08×10−18). We also found evidence for two additional promising loci that reached marginal genome-wide significance (P<2.0×10−7) at 8q22.3 (ODF1; P=5.40×10−8) and 5p15.33 (TERT; P=1.92×10−7). Although further studies are required, proximity of several of these loci to genes involved in apoptosis suggests a plausible underlying biological mechanism. CLL is a B-cell malignancy with a strong familial component[1] and an ~8.5-fold increased relative risk in first-degree relatives.[2] Previous CLL GWAS have identified 13 loci that explain a portion of the familial risk,[3-6] suggesting that additional loci of modest effects can be found using a larger discovery sample size.[7] As part of a larger initiative in non-Hodgkin lymphoma (NHL) (called the NHL-GWAS), we genotyped 2,343 CLL cases and 2,854 controls of European descent from 22 studies using the Illumina OmniExpress Beadchip (see Online Methods and Supplementary Table 1). Of those 5,197 subjects, 94% passed rigorous quality control criteria (see Online Methods and Supplementary Table 2) and 549,934 SNPs successfully passed quality control criteria with a median call rate >98%. 
We also utilized genotype data previously generated on the Illumina Omni2.5 from an additional 3,536 controls and one case from three studies[8] giving a total of 2,179 cases and 6,221 controls for the analysis of the NHL-GWAS (Supplementary Table 3). In the NHL-GWAS (Stage 1) analysis, we observed an enrichment of SNPs with small P-values compared to the null distribution with a lambda of 1.026 in the Q-Q plot (Supplementary Figure 1). After exclusion of previously established loci, an excess of small P-values still remained suggesting additional novel loci were yet to be discovered. In our Stage 1 analyses, we observed SNPs from 10 unique loci (defined as separated by at least 500kb and linkage disequilibrium (LD) r2<0.05), which reached genome-wide significance (P<5×10−8), including eight established loci and two novel loci (Supplementary Figure 2). We then performed a meta-analysis of the NHL-GWAS with three other independent CLL GWAS[5,9] that had a combined total of 921 CLL cases and 1,446 controls (Stage 2, Supplementary Tables 1 and 3). Because these other CLL GWAS studies were conducted on different commercial SNP microarrays, we imputed common SNPs from the 1000 Genomes Project[10] using IMPUTE2[11] (Online Methods, Supplementary Table 4). In the meta-analysis of stages 1 and 2 data, associations for all 13 established loci showed a consistent direction of effect with previously reported studies, and 10 loci achieved P<5×10−8 (Supplementary Table 5). However, two previously established loci, 15q25.2 and 19q13.3, were only nominally significant in the meta-analysis (P=0.03, and P=0.008, respectively), and no significant association was observed in stage 1 for the 15q25.2 locus (P=0.10). A suggestive locus on 18q21.1 that had not met genome-wide significance in prior studies[12] was also nominally significant (P=5.06×10−4) herein. 
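The lambda reported for the Q-Q plot is the genomic inflation factor. As a minimal sketch, assuming it is computed in the usual way (the median of the observed 1-df chi-square statistics divided by the median of the null chi-square distribution), it could be derived from a list of association P-values as follows. The function name and the input format are illustrative and are not taken from the study's GLU pipeline.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Return lambda = median(observed chi-square) / median(null chi-square).

    Each two-sided P-value is converted back to a 1-df chi-square statistic
    through the corresponding normal quantile (chi2 = z**2).
    P-values must lie in the interval (0, 1].
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Hypothetical input: evenly spaced P-values, as expected under the null.
    null_like = [(i + 0.5) / 1001 for i in range(1001)]
    print(round(genomic_inflation(null_like), 3))
```

A value close to 1, such as the 1.026 reported here, indicates little inflation of the test statistics from population stratification or other systematic bias.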
From the meta-analysis of stages 1–2, we identified 10 promising SNPs in the eight novel loci and one promising SNP in an established locus that we carried forward for de novo replication in stage 3; this stage included an additional 392 cases and 4,561 controls, as well as in silico replication in an independent CLL GWAS with 396 cases and 311 controls (see Online Methods and Supplementary Tables 1, 3, and 4). Seven of the 10 SNPs in novel loci reached genome-wide significance in the meta-analysis of all three stages: 10q23.31 (ACTA2/FAS; P=1.22×10−14), 18q21.33 (BCL2; P=2.66×10−12), 11p15.5 (C11orf21; P=2.15×10−10), 4q25 (LEF1; P=4.24×10−10), 2q33.1 (CASP10/CASP8; P=2.50×10−9), 9p21.3 (CDKN2B-AS1; P=1.27×10−8), and 18q21.32 (PMAIP1; P=2.51×10−8) (Table 1, Figure 1). Further, within the 18q21.33 locus, a second SNP (rs4987852), in low LD (r2=0.01) with rs4987855 and located only 372 bp away, also reached genome-wide significance (Table 1, P=7.76×10−11); this SNP was determined to be independent in conditional analyses (Pconditional=3.87×10−7, Table 2).
Table 1

Association results for novel loci and new independent SNPs

Novel loci

| Chr | Nearest gene(s) | SNP | Position | Risk allele(a) | Other allele | RAF | Stage | OR (95% CI) | P |
|---|---|---|---|---|---|---|---|---|---|
| 10q23.31 | ACTA2, FAS | rs4406737 | 90,749,704 | G | A | 0.57 | Stage 1 | 1.30 (1.21–1.40) | 3.30×10−12 |
| | | | | | | | Stage 2 | 1.17 (1.03–1.32) | 0.01 |
| | | | | | | | Stage 3 | 1.27 (1.06–1.52) | 0.007 |
| | | | | | | | Combined(b) | 1.27 (1.19–1.33) | 1.22×10−14 |
| 18q21.33 | BCL2 | rs4987855* | 58,944,529 | G | A | 0.91 | Stage 1 | 1.47 (1.28–1.69) | 5.51×10−8 |
| | | | | | | | Stage 2 | 1.47 (1.18–1.85) | 0.0007 |
| | | | | | | | Stage 3 | 1.43 (1.12–1.82) | 0.004 |
| | | | | | | | Combined(b) | 1.47 (1.32–1.61) | 2.66×10−12 |
| | | rs4987852 | 58,944,901 | G | A | 0.06 | Stage 1 | 1.43 (1.26–1.63) | 2.67×10−8 |
| | | | | | | | Stage 2 | 1.24 (0.98–1.56) | 0.07 |
| | | | | | | | Stage 3 | 1.52 (1.17–1.97) | 0.002 |
| | | | | | | | Combined(b) | 1.41 (1.27–1.56) | 7.76×10−11 |
| 11p15.5 | C11orf21, TSPAN32 | rs7944004 | 2,267,728 | T | G | 0.49 | Stage 1 | 1.19 (1.11–1.28) | 7.20×10−7 |
| | | | | | | | Stage 2 | 1.15 (1.02–1.32) | 0.03 |
| | | | | | | | Stage 3 | 1.27 (1.11–1.45) | 0.0006 |
| | | | | | | | Combined(b) | 1.20 (1.13–1.27) | 2.15×10−10 |
| 4q25 | LEF1 | rs898518* | 109,236,273 | A | C | 0.59 | Stage 1 | 1.16 (1.08–1.24) | 8.47×10−5 |
| | | | | | | | Stage 2 | 1.26 (1.11–1.43) | 0.0004 |
| | | | | | | | Stage 3 | 1.30 (1.14–1.49) | 0.0002 |
| | | | | | | | Combined(b) | 1.20 (1.14–1.27) | 4.24×10−10 |
| 2q33.1 | CASP10, CASP8 | rs3769825 | 201,819,625 | T | C | 0.45 | Stage 1 | 1.18 (1.10–1.27) | 3.43×10−6 |
| | | | | | | | Stage 2 | 1.16 (1.03–1.32) | 0.01 |
| | | | | | | | Stage 3 | 1.22 (1.07–1.40) | 0.004 |
| | | | | | | | Combined(b) | 1.19 (1.12–1.25) | 2.50×10−9 |
| 9p21.3 | CDKN2B-AS1 | rs1679013 | 22,196,987 | C | T | 0.52 | Stage 1 | 1.18 (1.10–1.27) | 4.47×10−6 |
| | | | | | | | Stage 2 | 1.32 (1.12–1.52) | 0.0004 |
| | | | | | | | Stage 3 | 1.11 (0.93–1.32) | 0.25 |
| | | | | | | | Combined(b) | 1.19 (1.12–1.27) | 1.27×10−8 |
| 18q21.32 | PMAIP1 | rs4368253 | 55,773,267 | C | T | 0.69 | Stage 1 | 1.18 (1.09–1.27) | 3.65×10−5 |
| | | | | | | | Stage 2 | 1.24 (1.08–1.41) | 0.002 |
| | | | | | | | Stage 3 | 1.18 (1.02–1.37) | 0.03 |
| | | | | | | | Combined(b) | 1.19 (1.12–1.27) | 2.51×10−8 |
| 15q15.1 | BMF | rs8024033 | 38,190,949 | C | G | 0.51 | Stage 1 | 1.22 (1.14–1.32) | 2.72×10−8 |
| | | | | | | | Stage 2 | 1.22 (1.08–1.39) | 0.003 |
| | | | | | | | Stage 3 | - | - |
| | | | | | | | Combined(b) | 1.22 (1.15–1.30) | 2.71×10−10 |
| 2p22.2 | QPCT, PRKD3 | rs3770745 | 37,449,593 | T | C | 0.22 | Stage 1 | 1.29 (1.18–1.40) | 8.23×10−9 |
| | | | | | | | Stage 2 | 1.10 (0.95–1.28) | 0.21 |
| | | | | | | | Stage 3 | - | - |
| | | | | | | | Combined(b) | 1.24 (1.15–1.33) | 1.68×10−8 |

New independent SNP in established locus

| Chr | Nearest gene(s) | SNP | Position | Risk allele(a) | Other allele | RAF | Stage | OR (95% CI) | P |
|---|---|---|---|---|---|---|---|---|---|
| 2q13 | ACOXL, BCL2L11 | rs13401811* | 111,332,575 | G | A | 0.81 | Stage 1 | 1.43 (1.28–1.56) | 9.76×10−13 |
| | | | | | | | Stage 2 | 1.45 (1.23–1.72) | 9.39×10−6 |
| | | | | | | | Stage 3 | 1.32 (1.08–1.59) | 0.007 |
| | | | | | | | Combined(b) | 1.41 (1.30–1.52) | 2.08×10−18 |

(a) The risk allele is the allele corresponding to the estimated odds ratio; RAF = risk allele frequency in controls; OR = per-allele odds ratio adjusted for age, sex and significant principal components.

(b) Number of cases and controls in the joint analysis of stage 1 + stage 2 + stage 3: rs4406737 (3,481/12,170), rs4987855 (3,883/12,446), rs4987852 (3,880/12,497), rs7944004 (3,869/12,476), rs898518 (3,879/12,441), rs3769825 (3,885/12,471), rs1679013 (3,482/12,148), rs4368253 (3,882/12,473), rs8024033 (3,096/7,663), rs3770745 (3,097/7,663), rs13401811 (3,839/12,264).

(*) For the ICGC study in stage 3, results for proxy SNPs were provided (rs4987856/rs4987855, r2=1.0; rs7698317/rs898518, r2=1.0; rs1554005/rs13401811, r2=1.0).

Identified from the 1000 Genomes meta-analysis of stage 1 and stage 2 with imputation information >0.9 in the NHL-GWAS.

Figure 1

Association results, recombination hot-spots, and linkage disequilibrium (LD) plots for the regions newly associated with CLL

Top, association results of GWAS data from Stage 1 NHL-GWAS (grey diamonds), Stage 2 combined data (blue diamond), Stage 3 combined data (purple diamond), and Stages 1–3 combined data (red diamond) are shown in the top panel with −log10(P) values (left y axis). Overlaid are the likelihood ratio statistics (right y axis) to estimate putative recombination hotspots across the region on the basis of 5 unique sets of 100 randomly selected control samples. Bottom, LD heatmap based on r2 values from total control populations for all SNPs included in the GWAS. (a) 10q23.31 region; (b) 18q21.33 region; (c) 11p15.5 region; (d) 4q25 region; (e) 2q33.1 region; (f) 9p21.3 region; (g) 18q21.32 region; (h) 15q15.1 region; (i) 2p22.2 region.

Table 2

Conditional analyses for select SNPs

| New SNP | Chr | Position | Nearest gene | OR(a) | P(a) | Conditional OR(b) | Conditional P(b) | Established SNP | r2(*) | OR(c) | P(c) | Conditional OR(d) | Conditional P(d) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs13401811 | 2q13 | 111,332,575 | ACOXL, BCL2L11 | 1.43 | 6.09×10−17 | 1.35 | 1.60×10−12 | rs17483466 | 0.02 | 1.37 | 3.53×10−17 | 1.31 | 6.70×10−13 |
| rs7578199 | 2q37.3 | 241,841,521 | HDLBP, FARP2 | 1.20 | 5.39×10−7 | 1.19 | 6.10×10−6 | rs757978 | 0.01 | 1.29 | 1.35×10−7 | 1.26 | 2.37×10−6 |
| rs9273363 | 6p21.32 | 32,734,250 | HLA | 1.24 | 2.24×10−10 | 1.24 | 3.50×10−9 | rs674313 | 0.21 | 1.13 | 5.00×10−4 | 1.06 | 0.11 |
| rs9273363 | 6p21.32 | 32,734,250 | HLA | 1.24 | 2.24×10−10 | 1.23 | 3.14×10−9 | rs9272535 | 0.11 | 1.18 | 7.60×10−6 | 1.12 | 0.002 |
| rs11636802 | 15q21.3 | 54,562,889 | MNS1 | 1.41 | 1.68×10−13 | 1.38 | 1.54×10−9 | rs7169431 | 0.16 | 1.27 | 1.72×10−5 | 1.06 | 0.32 |
| rs35748167 | 18q21.32 | 56,188,413 | PMAIP1, MC4R | 1.32 | 9.31×10−9 | 1.25 | 7.89×10−7 | rs4368253(e) | 0.003 | 1.19 | 2.82×10−7 | 1.18 | 5.76×10−7 |
| rs4987852 | 18q21.33 | 58,944,901 | BCL2 | 1.41 | 7.76×10−11 | 1.36 | 1.50×10−8 | rs4987855(e) | 0.01 | 1.47 | 2.66×10−12 | 1.41 | 1.33×10−10 |

(*) r2 linkage disequilibrium is based on the 1000 Genomes Project and is between the new SNP and the established SNP in the locus.

(a) Per-allele OR and P for the new SNP from the unconditional meta-analysis, based on stages 1 + 2 for all loci except 18q21.33. Data from stages 1–3 were used for 18q21.33.

(b) OR and P for the new SNP from the conditional meta-analysis.

(c) OR and P for the established SNP from the unconditional meta-analysis.

(d) OR and P for the established SNP from the conditional meta-analysis.

(e) SNP discovered and confirmed in the current study.

To explore these regions in greater detail and identify additional loci that we may have missed using just the genotyped SNPs in Stage 1, we imputed Stage 1 of our NHL-GWAS using the 1000 Genomes Project[10] data (February 2012 release) and performed a meta-analysis of the results from stage 1 and stage 2. The most significant SNPs at three of our novel loci, 10q23.31 (rs2147420), 18q21.33 (rs4987856), and 4q25 (rs2003869), were highly correlated (r2 ≥0.95) with our strongest genotyped SNPs, rs4406737, rs4987855, and rs898518, respectively (Supplementary Table 6). Only modest correlation (r2 range: 0.18–0.58) was observed between the most significant imputed SNPs at 11p15.5 (rs2521269), 2q33.1 (rs11688943), and 9p21.3 (rs1359742) and our strongest genotyped SNPs in each of the respective regions. The most significant of the imputed SNPs at 18q21.32 (rs35748167) appeared to be independent of our strongest genotyped SNP (rs4368253, r2=0.003, Pconditional < 7.89×10−7 for both SNPs), suggesting a possible second, independent signal (Table 2). Meta-analysis of our imputed scan data revealed two novel loci, 15q15.1 (BMF; P=2.71×10−10) and 2p22.2 (QPCT; P=1.68×10−8) (Table 1, Figure 1). In addition, although our genotyped SNP at 5p15.33 (TERT, rs10069690, P=1.92×10−7) (Supplementary Table 7) did not reach genome-wide significance, we did observe an imputed SNP in this region that reached genome-wide significance (rs7705526; P=3.75×10−8). Another promising locus was observed at 8q22.3 (ODF1; P=5.40×10−8) (Supplementary Table 7). Additional studies are needed to confirm these findings, particularly the signal on 5p15.33, which is already known to harbor risk variants for multiple cancers.[13-20]

An examination of established loci revealed a new SNP in 2q13 (BCL2L11, rs13401811, P=6.09×10−17; Table 1, Figure 2) that was independent of the previously reported SNP. 
After conditioning on the established 2q13 SNP (rs17483466, r2=0.02), the new SNP rs13401811 remained strongly associated with CLL risk (Pconditional=1.60×10−12, Table 2). A putative second signal was observed at the established 2q37.3 locus (Supplementary Table 5, rs7578199, P =5.39×10−7) that was in low LD (r2=0.01) and independent of the previously reported rs757978 SNP (Pconditional=6.10×10−6, Table 2), although rs7578199 was not genome-wide significant. Another possible second signal was observed on 6p21.32 (Supplementary Table 5, HLA, rs9273363, P=2.24×10−10). Rs9273363 showed some evidence of conditional independence with the originally reported SNPs (r2≤0.25, Pconditional ≤3.50×10−9, Table 2); however, it may be part of a shared HLA haplotype; thus accurate HLA typing is needed to further clarify its level of independence. Finally, we observed a SNP at 15q21.3 (Supplementary Table 5, rs11636802, P=1.68×10−13) that had stronger statistical significance than that of the previously reported SNP, rs7169431 (P=1.72×10−05). Although only modestly correlated (r2=0.16), rs11636802 explained all of the risk associated with rs7169431 in a conditional analysis (Table 2) suggesting that this SNP may be a better marker for the locus.
Figure 2

Association results, recombination hot-spots, and linkage disequilibrium (LD) plot for the new independent CLL susceptibility SNP in the 2q13 established locus

Top, association results of GWAS data from Stage 1 NHL-GWAS (grey diamonds), Stage 2 combined data (blue diamond), Stage 3 combined data (purple diamond), and Stages 1–3 combined data (red diamond) are shown in the top panel with −log10(P) values (left y axis). Overlaid are the likelihood ratio statistics (right y axis) to estimate putative recombination hotspots across the region on the basis of 5 unique sets of 100 randomly selected control samples. Bottom, LD heatmap based on r2 values from total control populations for all SNPs included in the GWAS.

Heritability analysis indicated that the ten independent SNPs in our novel loci together with the new independent SNP at 2q13 (Table 1) explain approximately 5% more of the familial risk in addition to ~12% for the established loci. When we explored the contribution of all common variants to the genetic heritability of CLL (using a method that estimates the variance explained by fitting all genotyped autosomal SNPs simultaneously[21,22], Online Methods), we estimated that common SNPs have the potential to explain up to ~46% of the familial risk, suggesting that more common loci, likely of small effects, are yet to be discovered. However, the analysis also implies that common SNPs probably do not explain all of the familial risk and that other factors, such as uncommon SNPs with modest effects or rare highly penetrant variants, are likely to also play a role.

Five of the novel loci (10q23.31, 18q21.33, 2q33.1, 18q21.32, and 15q15.1) identified in this study as well as the new SNP at the established 2q13 locus are located in or near genes involved in apoptosis. Rs4406737 is located on 10q23.31 between the first and second exons of FAS, a member of the tumor necrosis factor receptor superfamily that has a crucial role in the initiation of the signaling cascade of the caspase family in apoptosis. Mutations in FAS leading to defective Fas-mediated apoptosis have been documented in inherited lymphoproliferative disorders associated with autoimmunity,[23,24] and families with germline FAS mutations have a substantially increased risk of other lymphoma subtypes.[25] The two newly identified SNPs at 18q21.33 (rs4987855 and rs4987852) map to the 3′-UTR of B-cell CLL/lymphoma 2 (BCL2), which encodes an essential outer mitochondrial membrane protein that blocks lymphocyte apoptosis. 
Constitutive expression of BCL2 through t(14;18) and other translocations is common in follicular lymphomas, and the translocation is also seen in CLL, albeit rarely.[26] Both SNPs are located within a narrow region of BCL2 where the majority of t(14;18) translocation breakpoints occur.[27] Rs4987855 is in linkage disequilibrium with a SNP (rs4987856, r2=1.0) that is located within 200bp of a putative microRNA binding site for mir-195[28] and was found to be nominally correlated with BCL2 expression (Supplementary Table 8, P=0.02).[29] Forced overexpression of BCL2 in mice leads to an increased incidence of B-cell lymphomas.[30] The novel SNPs at 18q21.32 and 15q15.1 as well as the new SNP at the established 2q13 locus are located near Bcl-2 family member genes. Rs4368253 is located approximately 51kb downstream from phorbol-12-myristate-13-acetate-induced protein 1 (PMAIP1), which encodes the proapoptotic BCL2 protein, NOXA. Regulation of apoptosis through NOXA is critical for B-cell expansion after antigen triggering.[31] Down-regulation of NOXA contributes to the persistence of CLL B-cells in the lymph node environment.[32] Rs8024033 is located approximately 5.4kb upstream of Bcl-2 modifying factor (BMF), which encodes an apoptotic activator that binds to BCL2 proteins. BMF has been implicated in the survival of chronic lymphocytic leukemia cells,[33] and loss of BMF in mice leads to B-cell hyperplasia and an accelerated development of radiation-induced thymic lymphomas.[34] The new SNP (rs13401811) at 2q13, a locus previously implicated in risk of CLL[3,35,36] and more generally B-cell non-Hodgkin lymphomas,[37] is located approximately 262kb upstream of BCL2-like 11 (BCL2L11). BCL2L11 encodes a pro-apoptotic member of the BCL2 family, BIM, which plays a key role in the regulation of apoptosis in T- and B-cell homeostasis. 
Loss of BIM accelerates Myc-induced leukemia in mice,[38] and this SNP has been previously reported to be nominally associated with CLL in a small candidate gene study.[39] The novel 2q33.1 SNP (rs3769825) resides in intron 2 of caspase-8 (CASP8) and is in LD with a missense SNP (rs13006529, r2=0.71) in the nearby caspase-10 (CASP10) (Supplementary Table 9), both of which play a central role in cell apoptosis. SNPs within this region have been associated with breast cancer,[40] esophageal cancer,[41] and melanoma[42] susceptibility. SNPs in CASP8/CASP10, including one in moderate LD with ours (rs11674246, r2=0.66), were previously nominally associated with CLL risk in smaller case-control studies.[43,44]

The remaining four novel loci (11p15.5, 4q25, 9p21.3 and 2p22.2) map to other biologically interesting genes. The 4q25 SNP, rs898518, is located between the fourth and fifth exons of lymphoid enhancer-binding factor 1 (LEF1), which encodes a transcription factor involved in the Wnt signaling pathway, an essential component for the normal homeostasis of hematopoietic stem cells.[45] Aberrant protein expression of LEF1 has been observed in CLL cells as well as monoclonal B-cell lymphocytosis, suggesting that LEF1 plays an early role in CLL leukemogenesis.[46] Rs1679013 maps to an intergenic region on 9p21.3, roughly 200kb upstream from CDKN2B-AS1, an antisense non-coding RNA implicated in the risk of acute lymphocytic leukemia.[47] The 2p22.2 SNP (rs3770745) is located approximately 52kb upstream of protein kinase D3 (PRKD3), which interacts with the transcriptional repressor B-cell lymphoma 6 (BCL-6). Lastly, the 11p15.5 region contains many imprinted genes and has been implicated in Beckwith-Wiedemann syndrome,[48] a disorder characterized by excessive growth and a high incidence of childhood tumors.[49]

In conclusion, our large GWAS of CLL identified ten SNPs in nine novel loci and one new independent SNP in a previously discovered locus. 
Together with the previously established loci, the cumulative set of SNPs corresponds to an area under the curve (AUC) of 0.73. Although further studies are required to fine-map the regions, the proximity of several of these loci to genes involved in apoptosis suggests a possible underlying mechanism of biological relevance. Our results further support a substantial contribution of common gene variants in the pathogenesis of CLL.

ONLINE METHODS

Stage 1: NHL-GWAS

As part of a larger initiative, we conducted a genome-wide association study (GWAS) of CLL using cases and controls of European descent from 22 studies of non-Hodgkin lymphoma (NHL) (Supplementary Table 1), including nine prospective cohort studies, eight population-based case-control studies, and five clinic or hospital-based case-control studies. All studies obtained informed consent from their participants and approval from their respective Institutional Review Boards for this study. As described in Supplementary Table 1, cases were ascertained from cancer registries, clinics or hospitals, or through self-report verified by medical and pathology reports. The phenotype information for all NHL cases was reviewed centrally at the International Lymphoma Epidemiology Consortium (InterLymph) Data Coordinating Center and harmonized according to the hierarchical classification proposed by the InterLymph Pathology Working Group based on the World Health Organization (WHO) classification (2008).[50,51] All CLL cases with sufficient DNA (n=2,343) and a subset of available controls frequency-matched by age and sex to cases (n=2,854) including 4% quality control duplicates were genotyped on the Illumina OmniExpress at the NCI Cancer Genomic Research Laboratory (CGR). Genotypes were called using Illumina GenomeStudio software, and quality control duplicates showed >99% concordance. Extensive quality control metrics were applied to the data. Monomorphic SNPs and SNPs with a call rate <93% were excluded. Samples with a call rate ≤93%, mean heterozygosity <0.25 or >0.33 based on the autosomal SNPs, or gender discordance (>5% heterozygosity on X chromosome for males and <20% heterozygosity on the X chromosome for females) were excluded. Unexpected duplicates (>99.9% concordance) and first-degree relatives based on identity by descent (IBD) sharing with Pi-hat>0.40 were removed. 
Ancestry was assessed using the GLU struct.admix module based on the method proposed by Pritchard et al,[52] and participants with <80% European ancestry were excluded (Supplementary Figure 3). After exclusions, 2,178 (93%) cases and 2,685 (94%) controls remained (Supplementary Table 2). Genotype data previously generated on the Illumina Omni2.5 from additional 3,536 controls and 1 case from three of the studies (ATBC, CPSII, and PLCO) were also included,[8] resulting in a total of 2,179 cases and 6,221 controls for the stage 1 analysis. Of these additional controls, 703 (~235 from each study) were selected to be representative of their cohort and cancer-free[8]. The remaining 2,823 controls were cancer-free controls from an unpublished study of prostate cancer in PLCO. SNPs with call rate <99%, with Hardy-Weinberg equilibrium P-value<1×10−6 or minor allele frequency <1% were excluded from analysis, leaving 549,934 SNPs for analysis. To evaluate population substructure, a principal components analysis (PCA) was performed using the Genotyping Library and Utilities (GLU), version 1.0, struct.pca module, which is similar to EIGENSTRAT.[53] Plots of the first ten principal components are shown in Supplementary Figure 4. Association testing was conducted assuming a log-additive genetic model, adjusting for age, sex, and significant principal components. All data analysis and management was conducted using GLU.
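The SNP-level exclusions described above (call rate <99%, Hardy-Weinberg equilibrium P<1×10−6, minor allele frequency <1%) can be sketched as a simple filter on genotype counts. This is a minimal illustration, not the GLU implementation used in the study: the function names are hypothetical, and the Hardy-Weinberg test shown is a 1-df chi-square goodness-of-fit test, which is an assumption because the text does not state which test was applied.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    ASSUMPTION: a chi-square goodness-of-fit test; the study may have used
    an exact test instead.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_snp_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.99, min_maf=0.01, min_hwe_p=1e-6):
    """Apply the three SNP-level thresholds quoted in the text."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, AB, BB, missing).
    print(passes_snp_qc(250, 500, 250, 0))   # genotypes in exact HWE proportions
    print(passes_snp_qc(500, 0, 500, 0))     # no heterozygotes at all
```

Checking the call rate and allele frequency before the Hardy-Weinberg test means monomorphic SNPs are rejected without evaluating the test.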

Stage 2: Three Independent CLL GWAS

Three independent CLL GWAS provided genotype data for a meta-analysis (Supplementary Table 1). In all three studies, subjects with a genotyping call rate <95%, duplicates, related individuals, and SNPs with a call rate <95% were removed prior to imputation (Supplementary Table 4). Imputation was conducted separately for each study using IMPUTE2[11] and a hybrid of the 1000 Genomes Project version 2 (February 2012 release) and Division of Cancer Epidemiology and Genetics (DCEG) European reference panels.[8,10] SNPs were imputed for a total of 921 cases and 1446 controls. Association testing was conducted for each study using SNPTEST version 2, adjusting for age, sex, and significant principal components for GEC and UCSF2. No principal components were significant for the Utah study.

Stage 3: Replication studies and technical validation

In stage 3, 10 SNPs in the most promising loci and one SNP from an established locus were taken forward for de novo replication in an additional 392 cases and 4561 controls from the NCI replication study (NCI Rep) and from the Utah/Sheffield Chronic Lymphocytic Leukemia study (Utah-Sheffield) (Supplementary Table 1). Additionally, these 10 SNPs were also taken forward in an in silico replication in 396 CLL cases and 311 controls from the International Cancer Genome Consortium (ICGC) (Supplementary Table 1). Genotyping for the NCI Rep study was conducted using custom TaqMan genotyping assays (Applied Biosystems) at the NCI Core Genotyping Resource and genotyping for the Utah-Sheffield study was conducted at the Core Research Facilities at the University of Utah. Blind duplicates (~5%) yielded 100% concordance. The ICGC study provided results for eight SNPs (or proxies) that were genotyped on the Affymetrix 6.0 SNP microarray (Supplementary Table 4). Association results for the NCI Rep and Utah-Sheffield studies were adjusted for age and sex, and results from the ICGC were adjusted for age, sex, and significant principal components. A comparison of the genotyping calls from the OmniExpress microarray and confirmatory TaqMan assays (n=384) yielded 99.9% concordance.

Meta-analysis

Meta-analyses were performed using the fixed effects inverse variance method based on the beta estimates and standard errors from each study. For all SNPs in Tables 1 and 2, no substantial heterogeneity was observed among studies in stage 1 or among studies in stages 1–3 combined after Bonferroni correction (Pheterogeneity ≥ 0.02 for all SNPs).
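The fixed-effects inverse-variance method can be illustrated with a short sketch. Here the per-stage odds ratios and 95% confidence intervals for rs4406737 from Table 1 stand in for the per-study beta estimates and standard errors, which are not given in the text; the standard errors are back-calculated from the rounded confidence limits, so the result only approximates the published combined estimate. Function names are illustrative.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def beta_se_from_or(odds_ratio, ci_low, ci_high):
    """Recover the log-odds ratio and its standard error from an OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    return beta, se


def fixed_effects_meta(estimates):
    """Inverse-variance weighted fixed-effects meta-analysis.

    estimates: iterable of (beta, se) pairs, one per study.
    Returns the pooled beta, its standard error, and the two-sided P-value.
    """
    estimates = list(estimates)
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p_value


if __name__ == "__main__":
    # Stage 1, 2 and 3 results for rs4406737 (Table 1).
    stages = [(1.30, 1.21, 1.40), (1.17, 1.03, 1.32), (1.27, 1.06, 1.52)]
    beta, se, p = fixed_effects_meta(beta_se_from_or(*s) for s in stages)
    print(f"OR = {math.exp(beta):.2f}, P = {p:.2e}")
```

With these inputs the pooled odds ratio comes out at roughly 1.27, consistent with the combined estimate in Table 1; the P-value differs somewhat from the published one because of rounding in the inputs and because the study pooled individual studies rather than stage-level summaries.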

Further follow-up analyses

Using 1000 Genomes data, we identified SNPs with r2>0.7 with our lead SNP that were reported to be non-synonymous or nonsense variants. We utilized HaploReg,[54] a tool for exploring non-coding functional annotation using ENCODE data, to evaluate the genome surrounding our SNPs (Supplementary Table 9). In addition, we evaluated cis associations between all novel and promising SNPs discovered in this study and the expression of nearby genes in lymphoblastoid cell lines from subjects of European descent from three publicly available datasets[29,55,56] (Supplementary Table 8).

Heritability analyses

To evaluate the familial risk explained by the novel loci identified in this study, we estimated the contribution of each SNP to the heritability using the equation[7] $h^2_{\mathrm{SNP}} = \beta^2 \cdot 2f(1-f)$, where β is the log-odds ratio per copy of the risk allele and f is the allele frequency, and then summed the contributions of all novel SNPs. Using the equation derived by Pharoah et al[57] to estimate the total heritability from the sibling relative risk (RR=8.5 from Goldin et al[2]), we then calculated the proportion of familial risk explained by dividing the summed contributions of the novel SNPs by the total heritability. To estimate the contribution of all common SNPs to familial risk, we used the method proposed by Yang et al,[21] which was extended to dichotomous traits[22] and implemented in the Genome-wide Complex Trait Analysis (GCTA) software.[58] The genetic similarity matrix was estimated from our discovery scan using all genotyped autosomal SNPs with a minor allele frequency >0.01. We used restricted maximum likelihood (REML), the default option for GCTA, to fit the appropriate variance components model that included the top 10 eigenvectors as covariates. The final estimate of heritability on the underlying liability scale assumed that the lifetime risk of CLL was 0.005. From this estimate, we calculated the proportion of familial risk explained based on a familial relative risk of 8.5. Details of fitting the variance components model and transforming from the observed to liability scale have been previously documented.[22]
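The per-SNP calculation above can be worked through with the combined odds ratios and risk allele frequencies from Table 1. The per-SNP formula is the one given in the text. The conversion from the sibling relative risk to total heritability is an assumption of this sketch: it uses the log-normal polygenic model, in which the total variance equals 2·ln(RR), because the exact form of the Pharoah et al equation is not reproduced in the text.

```python
import math

# rsID: (combined per-allele OR, risk allele frequency in controls), from Table 1.
TABLE1_SNPS = {
    "rs4406737": (1.27, 0.57),
    "rs4987855": (1.47, 0.91),
    "rs4987852": (1.41, 0.06),
    "rs7944004": (1.20, 0.49),
    "rs898518": (1.20, 0.59),
    "rs3769825": (1.19, 0.45),
    "rs1679013": (1.19, 0.52),
    "rs4368253": (1.19, 0.69),
    "rs8024033": (1.22, 0.51),
    "rs3770745": (1.24, 0.22),
    "rs13401811": (1.41, 0.81),
}


def snp_contribution(odds_ratio, freq):
    """Per-SNP contribution: beta^2 * 2f(1 - f), with beta the log-odds ratio."""
    beta = math.log(odds_ratio)
    return beta ** 2 * 2 * freq * (1 - freq)


def total_variance_from_sibling_rr(sibling_rr):
    """ASSUMPTION: log-normal polygenic model, total variance = 2 * ln(RR)."""
    return 2 * math.log(sibling_rr)


explained = sum(snp_contribution(o, f) for o, f in TABLE1_SNPS.values())
proportion = explained / total_variance_from_sibling_rr(8.5)

if __name__ == "__main__":
    print(f"Proportion of familial risk explained: {proportion:.1%}")
```

Under these assumptions the eleven SNPs account for a proportion close to 5% of the familial risk, in line with the figure reported in the main text.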

Estimate of recombination hotspots

To identify recombination hotspots in the region, we used SequenceLDhot,[59] a program that uses the approximate marginal likelihood method[60] and calculates likelihood ratio statistics at a set of possible hotspots. We tested five unique sets of 100 control samples. The PHASE v2.1 program was used to calculate background recombination rates,[61,62] and the LD heatmap was visualized in r2 using the snp.plotter program.[63]
  63 in total

1.  Application of coalescent methods to reveal fine-scale rate variation and recombination hotspots.

Authors:  Paul Fearnhead; Rosalind M Harding; Julie A Schneider; Simon Myers; Peter Donnelly
Journal:  Genetics       Date:  2004-08       Impact factor: 4.562

2.  Common variation at 6p21.31 (BAK1) influences the risk of chronic lymphocytic leukemia.

Authors:  Susan L Slager; Christine F Skibola; Maria Chiara Di Bernardo; Lucia Conde; Peter Broderick; Shannon K McDonnell; Lynn R Goldin; Naomi Croft; Amy Holroyd; Shelley Harris; Jacques Riby; Daniel J Serie; Neil E Kay; Timothy G Call; Paige M Bracci; Eran Halperin; Mark C Lanasa; Julie M Cunningham; Jose F Leis; Vicki A Morrison; Logan G Spector; Celine M Vachon; Tait D Shanafelt; Sara S Strom; Nicola J Camp; J Brice Weinberg; Estella Matutes; Neil E Caporaso; Rachel Wade; Martin J S Dyer; Claire Dearden; James R Cerhan; Daniel Catovsky; Richard S Houlston
Journal:  Blood       Date:  2012-06-13       Impact factor: 22.113

3.  GCTA: a tool for genome-wide complex trait analysis.

Authors:  Jian Yang; S Hong Lee; Michael E Goddard; Peter M Visscher
Journal:  Am J Hum Genet       Date:  2010-12-17       Impact factor: 11.025

4.  A genome-wide association study of global gene expression.

Authors:  Anna L Dixon; Liming Liang; Miriam F Moffatt; Wei Chen; Simon Heath; Kenny C C Wong; Jenny Taylor; Edward Burnett; Ivo Gut; Martin Farrall; G Mark Lathrop; Gonçalo R Abecasis; William O C Cookson
Journal:  Nat Genet       Date:  2007-09-16       Impact factor: 38.330

5.  C-T variant in a miRNA target site of BCL2 is associated with increased risk of human papilloma virus related cervical cancer--an in silico approach.

Authors:  G Reshmi; Ramachandran Surya; V T Jissa; P S Saneesh Babu; N R Preethi; W S Santhi; P G Jayaprakash; M Radhakrishna Pillai
Journal:  Genomics       Date:  2011-06-17       Impact factor: 5.736

6.  The development of lymphomas in families with autoimmune lymphoproliferative syndrome with germline Fas mutations and defective lymphocyte apoptosis.

Authors:  S E Straus; E S Jaffe; J M Puck; J K Dale; K B Elkon; A Rösen-Wolff; A M Peters; M C Sneller; C W Hallahan; J Wang; R E Fischer; C E Jackson; A Y Lin; C Bäumler; E Siegert; A Marx; A K Vaishnaw; T Grodzicky; T A Fleisher; M J Lenardo
Journal:  Blood       Date:  2001-07-01       Impact factor: 22.113

7.  Estimation of effect size distribution from genome-wide association studies and implications for future discoveries.

Authors:  Ju-Hyun Park; Sholom Wacholder; Mitchell H Gail; Ulrike Peters; Kevin B Jacobs; Stephen J Chanock; Nilanjan Chatterjee
Journal:  Nat Genet       Date:  2010-06-20       Impact factor: 38.330

8.  Bim is a suppressor of Myc-induced mouse B cell leukemia.

Authors:  Alexander Egle; Alan W Harris; Philippe Bouillet; Suzanne Cory
Journal:  Proc Natl Acad Sci U S A       Date:  2004-04-12       Impact factor: 11.205

9.  Lung cancer susceptibility locus at 5p15.33.

Authors:  James D McKay; Rayjean J Hung; Valerie Gaborieau; Paolo Boffetta; Amelie Chabrier; Graham Byrnes; David Zaridze; Anush Mukeria; Neonilia Szeszenia-Dabrowska; Jolanta Lissowska; Peter Rudnai; Eleonora Fabianova; Dana Mates; Vladimir Bencko; Lenka Foretova; Vladimir Janout; John McLaughlin; Frances Shepherd; Alexandre Montpetit; Steven Narod; Hans E Krokan; Frank Skorpen; Maiken Bratt Elvestad; Lars Vatten; Inger Njølstad; Tomas Axelsson; Chu Chen; Gary Goodman; Matt Barnett; Melissa M Loomis; Jan Lubiñski; Joanna Matyjasik; Marcin Lener; Dorota Oszutowska; John Field; Triantafillos Liloglou; George Xinarianos; Adrian Cassidy; Paolo Vineis; Francoise Clavel-Chapelon; Domenico Palli; Rosario Tumino; Vittorio Krogh; Salvatore Panico; Carlos A González; José Ramón Quirós; Carmen Martínez; Carmen Navarro; Eva Ardanaz; Nerea Larrañaga; Kay Tee Kham; Timothy Key; H Bas Bueno-de-Mesquita; Petra Hm Peeters; Antonia Trichopoulou; Jakob Linseisen; Heiner Boeing; Göran Hallmans; Kim Overvad; Anne Tjønneland; Merethe Kumle; Elio Riboli; Diana Zelenika; Anne Boland; Marc Delepine; Mario Foglio; Doris Lechner; Fumihiko Matsuda; Helene Blanche; Ivo Gut; Simon Heath; Mark Lathrop; Paul Brennan
Journal:  Nat Genet       Date:  2008-11-02       Impact factor: 38.330

10.  Genome-wide association study of follicular lymphoma identifies a risk locus at 6p21.32.

Authors:  Lucia Conde; Eran Halperin; Nicholas K Akers; Kevin M Brown; Karin E Smedby; Nathaniel Rothman; Alexandra Nieters; Susan L Slager; Angela Brooks-Wilson; Luz Agana; Jacques Riby; Jianjun Liu; Hans-Olov Adami; Hatef Darabi; Henrik Hjalgrim; Hui-Qi Low; Keith Humphreys; Mads Melbye; Ellen T Chang; Bengt Glimelius; Wendy Cozen; Scott Davis; Patricia Hartge; Lindsay M Morton; Maryjean Schenk; Sophia S Wang; Bruce Armstrong; Anne Kricker; Sam Milliken; Mark P Purdue; Claire M Vajdic; Peter Boyle; Qing Lan; Shelia H Zahm; Yawei Zhang; Tongzhang Zheng; Nikolaus Becker; Yolanda Benavente; Paolo Boffetta; Paul Brennan; Katja Butterbach; Pierluigi Cocco; Lenka Foretova; Marc Maynadié; Silvia de Sanjosé; Anthony Staines; John J Spinelli; Sara J Achenbach; Timothy G Call; Nicola J Camp; Martha Glenn; Neil E Caporaso; James R Cerhan; Julie M Cunningham; Lynn R Goldin; Curtis A Hanson; Neil E Kay; Mark C Lanasa; Jose F Leis; Gerald E Marti; Kari G Rabe; Laura Z Rassenti; Logan G Spector; Sara S Strom; Celine M Vachon; J Brice Weinberg; Elizabeth A Holly; Stephen Chanock; Martyn T Smith; Paige M Bracci; Christine F Skibola
Journal:  Nat Genet       Date:  2010-07-18       Impact factor: 38.330

