Xuemei Ji1, Semanti Mukherjee2, Maria Teresa Landi3, Yohan Bosse4, Philippe Joubert4, Dakai Zhu5,6, Ivan Gorlov5, Xiangjun Xiao6, Younghun Han6, Olga Gorlova5, Rayjean J Hung7, Yonathan Brhane7, Robert Carreras-Torres8, David C Christiani9,10, Neil Caporaso3, Mattias Johansson8, Geoffrey Liu7, Stig E Bojesen11,12,13, Loic Le Marchand14, Demetrios Albanes3, Heike Bickeböller15, Melinda C Aldrich16, William S Bush17, Adonina Tardon18, Gad Rennert19, Chu Chen20, Jinyoung Byun6, Konstantin H Dragnev21, John K Field22, Lambertus Fa Kiemeney23, Philip Lazarus24, Shan Zienolddiny25, Stephen Lam26, Matthew B Schabath27, Angeline S Andrew28, Pier A Bertazzi29,30, Angela C Pesatori29,30, Nancy Diao9, Li Su9, Lei Song3, Ruyang Zhang9, Natasha Leighl31, Jakob S Johansen32, Anders Mellemgaard32, Walid Saliba19, Christopher Haiman33, Lynne Wilkens14, Ana Fernandez-Somoano18, Guillermo Fernandez-Tardon18, Erik H F M van der Heijden23, Jin Hee Kim34, Michael P A Davies22, Michael W Marcus22, Hans Brunnström35, Jonas Manjer36, Olle Melander36, David C Muller37, Kim Overvad36, Antonia Trichopoulou38, Rosario Tumino39, Gary E Goodman40,41, Angela Cox42, Fiona Taylor42, Penella Woll42, Erich Wichmann43, Thomas Muley44,45, Angela Risch46, Albert Rosenberger15, Kjell Grankvist47, Mikael Johansson48, Frances Shepherd49, Ming-Sound Tsao49, Susanne M Arnold50, Eric B Haura51, Ciprian Bolca52, Ivana Holcatova53, Vladimir Janout54, Milica Kontic55, Jolanta Lissowska56, Anush Mukeria57, Simona Ognjanovic58, Tadeusz M Orlowski59, Ghislaine Scelo8, Beata Swiatkowska60, David Zaridze57, Per Bakke61, Vidar Skaug25, Lesley M Butler62, Kenneth Offit2, Preethi Srinivasan63, Chaitanya Bandlamudi64, Matthew D Hellmann2, David B Solit2,64, Mark E Robson2, Charles M Rudin2, Zsofia K Stadler2, Barry S Taylor64,65, Michael F Berger63,64, Richard Houlston66, John McLaughlin67, Victoria Stevens68, David C Nickle69, Ma'en Obeidat70, Wim Timens71, María Soler Artigas72,73, Sanjay Shete74, Hermann Brenner75, 
Stephen Chanock3, Paul Brennan8, James D McKay8, Christopher I Amos76,77.
Abstract
Few germline mutations are known to affect lung cancer risk. We performed analyses of rare variants from 39,146 individuals of European ancestry and investigated gene expression levels in 7,773 samples. We find a large-effect association with an ATM L2307F (rs56009889) mutation in adenocarcinoma for discovery (adjusted odds ratio (OR) = 8.82, P = 1.18 × 10⁻¹⁵) and replication (adjusted OR = 2.93, P = 2.22 × 10⁻³) that is more pronounced in females (adjusted OR = 6.81 and 3.19 for discovery and replication, respectively). We observe an excess loss of heterozygosity in lung tumors among ATM L2307F allele carriers. L2307F is more frequent (4%) among Ashkenazi Jewish populations. We also observe an association in discovery (adjusted OR = 2.61, P = 7.98 × 10⁻²²) and replication datasets (adjusted OR = 1.55, P = 0.06) with a loss-of-function mutation, Q4X (rs150665432), of an uncharacterized gene, KIAA0930. Our findings implicate germline genetic variants in ATM with lung cancer susceptibility and suggest KIAA0930 as a novel candidate gene for lung cancer risk.
Year: 2020 PMID: 32393777 PMCID: PMC7214407 DOI: 10.1038/s41467-020-15905-6
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Allele-specific lung cancer risk for ATM-L2307F (rs56009889) and KIAA0930-Q4X (rs150665432).
| Mutation | Outcome | Population | Dataset | Crude OR (95% CI) | Crude P | Adjusted by PCsᵃ OR (95% CI) | Adjusted by PCsᵃ P |
|---|---|---|---|---|---|---|---|
| rs56009889 | Lung cancer | All | Discovery | 3.98 (2.40–6.61) | 8.21E−09 | 3.15 (1.89–5.26) | 1.04E−05 |
| rs56009889 | Lung cancer | All | Replication | 1.39 (0.70–2.73) | 0.34 | 1.36 (0.69–2.68) | 0.37 |
| rs56009889 | Lung cancer | Female | Discovery | 9.45 (3.39–26.34) | 1.49E−07 | 6.81 (2.42–19.17) | 0.0003 |
| rs56009889 | Lung cancer | Female | Replication | 3.36 (1.22–9.26) | 0.01 | 3.19 (1.16–8.82) | 0.03 |
| rs56009889 | LAD | All | Discovery | 8.15 (4.86–13.68) | 2.74E−21 | 8.82 (5.18–15.03) | 1.18E−15 |
| rs56009889 | LAD | All | Replication | 3.00 (1.51–5.96) | 9.76E−04 | 2.93 (1.47–5.82) | 2.22E−03 |
| rs150665432 | Lung cancer | All | Discovery | 2.78 (2.27–3.39) | 1.71E−25 | 2.61 (2.15–3.18) | 7.98E−22 |
| rs150665432 | Lung cancer | All | Replication | 1.54 (0.98–2.42) | 0.06 | 1.55 (0.98–2.44) | 0.06 |

OR, 95% CI and P values were generated from a logistic regression model.
ᵃ PCs are the principal components.
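As a rough cross-check of the crude allele-specific estimates, an allele-based odds ratio with a Wald confidence interval can be computed directly from genotype counts. The sketch below is an illustration only: the source states that a logistic regression model was used, whereas this uses a plain 2×2 allele table, and the function name `allelic_odds_ratio` is hypothetical. The inputs are the discovery genotype counts for rs56009889 (CC, TC, TT) reported in the carrier table of this record. Under these assumptions the result comes out close to the crude discovery estimate for overall lung cancer, 3.98 (2.40–6.61).

```python
import math


def allelic_odds_ratio(case_counts, control_counts, z=1.959964):
    """Crude allele-based OR with a Wald 95% CI (illustrative, not the authors' code).

    Each argument is a tuple (ref_hom, het, alt_hom) of genotype counts.
    """
    def allele_counts(ref_hom, het, alt_hom):
        alt = het + 2 * alt_hom   # copies of the risk (alternate) allele
        ref = 2 * ref_hom + het   # copies of the reference allele
        return alt, ref

    a, b = allele_counts(*case_counts)      # risk / reference alleles in cases
    c, d = allele_counts(*control_counts)   # risk / reference alleles in controls

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table (Woolf's method).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Discovery genotype counts (CC, TC, TT) for overall lung cancer.
cases = (15767, 77, 5)
controls = (13005, 18, 0)

odds_ratio, lower, upper = allelic_odds_ratio(cases, controls)
print(f"crude allelic OR = {odds_ratio:.2f} ({lower:.2f}-{upper:.2f})")
```

The Wald interval needs every cell of the allele table to be non-zero; the adjusted estimates cannot be recovered this way because they depend on individual-level covariates.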
Fig. 1 Regional lung cancer association plots for the ATM and KIAA0930 risk loci.
a ATM region for lung cancer risk. rs56009889 localizes to chromosome 11, maps within ATM, and is not in linkage disequilibrium (LD) with any previously identified SNPs; b KIAA0930 region for lung cancer risk. rs150665432 localizes to chromosome 22, maps within the uncharacterized gene KIAA0930, and is not in LD with any previously identified SNPs. For each plot, −log10 P values (y-axis) of the SNPs are shown according to their chromosomal positions (x-axis). The top genotyped SNP in each analysis is labeled by its rs number. The color intensity of each symbol reflects the extent of LD with the top lung cancer-associated SNP in the discovery data: blue (r² = 0) through to red (r² = 1.0). Physical positions are based on NCBI build 37 of the human genome. The relative positions of genes are also shown. Source data are provided as a Source Data file (Source Data 1).
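The two quantities encoded in a regional association plot, the y-axis value −log10 P and the LD measure r² used for coloring, can be sketched as follows. This is a minimal illustration with hypothetical helper names; it approximates r² by the squared Pearson correlation of genotype dosages (0, 1 or 2 copies of an allele), which is an assumption and not necessarily the haplotype-based calculation behind the published figure.

```python
import math


def neg_log10_p(p_value):
    """y-axis value of an association plot."""
    return -math.log10(p_value)


def genotype_r_squared(dosages_a, dosages_b):
    """Squared Pearson correlation between two SNPs' genotype dosages."""
    if len(dosages_a) != len(dosages_b) or not dosages_a:
        raise ValueError("dosage vectors must be non-empty and of equal length")
    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Toy dosages for five individuals (made up for illustration only).
top_snp = [0, 1, 2, 0, 1]
neighbor = [0, 1, 1, 0, 2]
print(neg_log10_p(1.18e-15), genotype_r_squared(top_snp, neighbor))
```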
Lung cancer risk for the carriers of ATM-L2307F (rs56009889) and KIAA0930-Q4X (rs150665432).
| Outcome | Population | Gene | Genotype | Discovery controls (No.) | Discovery cases (No.) | Discovery adjustedᵃ OR (95% CI) | Discovery P | Replication controls (No.) | Replication cases (No.) | Replication adjustedᵃ OR (95% CI) | Replication P | Meta-analysis# OR (95% CI) | Meta-analysis# P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lung cancer | All | ATM | CC | 13005 | 15767 | 1 | | 5331 | 4891 | 1 | | | |
| Lung cancer | All | ATM | TC | 18 | 77 | 3.79 (2.2–6.6) | 2.57E−06 | 15 | 19 | 1.31 (0.65–2.65) | 0.45 | 2.52 (1.63–3.91) | 3.18E−05 |
| Lung cancer | All | ATM | TT | 0 | 5 | Inf (0.8–Inf) | 0.068* | 0 | 0 | – | – | – | – |
| Lung cancer | All | ATM | TC + TT | 18 | 82 | 4.19 (2.4–7.3) | 3.56E−07 | 15 | 19 | 1.31 (0.65–2.65) | 0.45 | 2.7 (1.75–4.16) | 7.82E−06 |
| Lung cancer | All | ATM | Trend | | | | 2.45E−07 | | | | – | | |
| Lung cancer | Female | ATM | CC | 5096 | 5777 | 1 | | 2475 | 2203 | 1 | | | |
| Lung cancer | Female | ATM | TC | 4 | 41 | 7.67 (2.6–22) | 0.0002 | 5 | 15 | 3.22 (1.12–9.21) | 0.03 | 4.94 (2.34–10.5) | 2.92E−05 |
| Lung cancer | Female | ATM | TT | 0 | 1 | Inf (0–Inf) | 0.49* | 0 | 0 | – | – | – | – |
| Lung cancer | Female | ATM | TC + TT | 4 | 42 | 7.76 (2.7–22) | 0.0002 | 5 | 15 | 3.22 (1.12–9.21) | 0.03 | 4.97 (2.35–10.5) | 2.67E−05 |
| Lung cancer | Female | ATM | Trend | | | | 0.0002 | | | | – | | |
| LAD | All | ATM | CC | 13005 | 6267 | 1 | | 5331 | 2139 | 1 | | | |
| LAD | All | ATM | TC | 18 | 61 | 4.68 (2.7–8.2) | 7.92E−08 | 15 | 18 | 2.48 (1.22–5.04) | 0.01 | 3.66 (2.36–5.69) | 7.93E−09 |
| LAD | All | ATM | TT | 0 | 5 | Inf (1.9–Inf) | 0.004* | 0 | 0 | – | – | – | – |
| LAD | All | ATM | TC + TT | 18 | 66 | 5.23 (3–9.2) | 6.47E−09 | 15 | 18 | 2.48 (1.22–5.04) | 0.01 | 3.93 (2.53–6.1) | 9.96E−10 |
| LAD | All | ATM | Trend | | | | 5.44E−09 | | | | – | | |
| LAD | Female | ATM | CC | 5096 | 2923 | 1 | | 2475 | 1186 | 1 | | | |
| LAD | Female | ATM | TC | 4 | 32 | 7.91 (2.7–23) | 0.0002 | 5 | 14 | 4.69 (1.65–13.4) | 0.004 | 6.05 (2.86–12.79) | 2.48E−06 |
| LAD | Female | ATM | TT | 0 | 1 | Inf (0–Inf) | 0.36* | 0 | 0 | – | – | – | – |
| LAD | Female | ATM | TC + TT | 4 | 33 | 8.05 (2.8–23) | 0.0001 | 5 | 14 | 4.69 (1.65–13.4) | 0.004 | 6.1 (2.89–12.9) | 2.14E−06 |
| LAD | Female | ATM | Trend | | | | 0.0001 | | | | – | | |
| Lung cancer | All | KIAA0930 | GG | 12642 | 14814 | 1 | | 5308 | 4861 | 1 | | | |
| Lung cancer | All | KIAA0930 | AG | 126 | 355 | 2.41 (2–3) | 7.83E−16 | 32 | 47 | 1.69 (1.05–2.7) | 0.03 | 2.27 (1.87–2.75) | 1.9E−16 |
| Lung cancer | All | KIAA0930 | AA | 0 | 29 | Inf (6.3–Inf) | 2.29E−08* | 0 | 0 | – | – | – | – |
| Lung cancer | All | KIAA0930 | AG + AA | 126 | 384 | 2.59 (2.1–3.2) | 1.15E−18 | 32 | 47 | 1.69 (1.05–2.7) | 0.03 | 2.41 (1.99–2.92) | 3.9E−19 |
| Lung cancer | All | KIAA0930 | Trend | | | | 1.51E−19 | | | | – | | |
ᵃ Adjusted for age at diagnosis/interview, gender, smoking status and PCs. # Fixed-effects meta-analysis adjusted for age at diagnosis/interview, gender, smoking status and PCs.
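A fixed-effects meta-analysis of two odds ratios is commonly done by inverse-variance weighting on the log scale. The sketch below assumes that approach, which the source does not spell out, and recovers each standard error from the rounded 95% CI limits in the table, so it can only approximate the published pooled values. The function names are hypothetical. For the ATM carrier (TC + TT) estimates for overall lung cancer, the pooled OR lands near the reported 2.7 (1.75–4.16), with small differences expected from rounding.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def log_or_and_se(odds_ratio, lower, upper):
    """Recover log(OR) and its standard error from a reported 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(odds_ratio), se


def fixed_effects_meta(estimates):
    """Inverse-variance fixed-effects pooling of (OR, lower, upper) tuples."""
    weighted_sum = 0.0
    total_weight = 0.0
    for odds_ratio, lower, upper in estimates:
        beta, se = log_or_and_se(odds_ratio, lower, upper)
        weight = 1.0 / se ** 2          # precision of this study's log(OR)
        weighted_sum += weight * beta
        total_weight += weight
    pooled_beta = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_beta),
        math.exp(pooled_beta - Z_95 * pooled_se),
        math.exp(pooled_beta + Z_95 * pooled_se),
    )


# Adjusted carrier (TC + TT) ORs for overall lung cancer: discovery, replication.
pooled_or, pooled_lower, pooled_upper = fixed_effects_meta(
    [(4.19, 2.4, 7.3), (1.31, 0.65, 2.65)]
)
print(f"pooled OR = {pooled_or:.2f} ({pooled_lower:.2f}-{pooled_upper:.2f})")
```

Rows with an infinite OR (no carriers among controls) have no finite standard error and cannot be pooled this way, which is consistent with the dashes in the meta-analysis columns for homozygotes.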
* Values were generated from a two-sided Fisher’s exact test. OR, 95% CI and P values were generated from a logistic regression model.
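The starred P values for homozygotes come from a two-sided Fisher's exact test, which stays valid when a cell count is zero. The sketch below implements the test with the standard library under one loudly stated assumption: the 2×2 table compares homozygous carriers (TT) with non-carriers (CC) by case/control status. The source does not say how the table was laid out, and the function name is hypothetical. With that layout the overall lung cancer and LAD rows give values close to the reported 0.068 and 0.004; other starred values may rest on a different layout or different counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1 = a + b              # e.g. all TT homozygotes
    col1 = a + c              # e.g. all cases
    n = a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    x_min = max(0, row1 - (n - col1))
    x_max = min(row1, col1)
    p_observed = prob(a)
    total = sum(
        p for p in (prob(x) for x in range(x_min, x_max + 1))
        if p <= p_observed * (1 + 1e-7)   # tolerance for floating-point ties
    )
    return min(1.0, total)


# Discovery data, assumed layout: rows TT vs CC, columns cases vs controls.
p_lung = fisher_exact_two_sided(5, 0, 15767, 13005)   # overall lung cancer
p_lad = fisher_exact_two_sided(5, 0, 6267, 13005)     # lung adenocarcinoma
print(f"lung cancer: P = {p_lung:.3f}; LAD: P = {p_lad:.4f}")
```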
Fig. 2 ATM rs56009889 association with lung cancer risk.
P values were determined by logistic regression analysis adjusted for age, gender, smoking status and the principal components. a Stratified analyses of the association between rs56009889 and lung cancer. Compared with non-carriers, L2307F carriers had an increased risk of lung cancer, with ORs of 4.19 in the discovery data (P = 3.56 × 10⁻⁷, n = 28872) and 1.31 in the replication data (P = 0.45, n = 10256). Among females, L2307F carriers had ORs for lung cancer of 7.76 in the discovery data (P = 0.0002, n = 10919) and 3.22 in the replication data (P = 0.03, n = 4698). L2307F carriers had a significant 5.2-fold increased risk of lung adenocarcinoma (LAD) in the discovery data (P = 6.47 × 10⁻⁹, n = 19356) and a 2.5-fold increased risk in the replication data (P = 0.01, n = 7503). No association of L2307F with the risk of lung squamous cell carcinoma (LSQ) (n = 16853) or small cell lung cancer (SCLC) (n = 14746) was observed in the discovery data. No L2307F variants were observed in LSQ or SCLC cases in the replication data. Colors indicate demographic and histological stratifications of the data. b Stratified analyses of the association between rs56009889 and LAD. Females who carried L2307F had a >8-fold greater risk of LAD in the discovery dataset (P = 0.0001, n = 8056) and a 4.7-fold greater risk of LAD in the replication data (P = 0.004, n = 3680). Never-smoking females who harbored L2307F had a 7-fold greater risk of LAD in the discovery data (P = 0.01, n = 2817) and a 3.8-fold greater risk of LAD in the replication data (P = 0.15, n = 1212). c Distribution of L2307F homozygotes. All L2307F homozygotes in the discovery data developed LAD, regardless of age, gender and smoking status. No homozygotes were found in the replication data. d Higher ORs for the association between rs56009889 and the risk of lung cancer, and of LAD overall and in females, were found in Israel (n = 1173) than in North America (n = 10858). All of the associations reached significance. The upper 95% CI limit of the LAD risk in females in Israel (adjusted OR = 17.15; 95% CI 2.24–131.32, n = 373) is not shown because it was too high. Colors indicate stratifications of the data by histology and sex. The error bars are OR ± the 95% CI values. Source data are provided as a Source Data file.
Fig. 3 KIAA0930 rs150665432 association with lung cancer risk.
a Stratified analyses of the association between KIAA0930 Q4X and lung cancer risk, shown by different colors. Compared with non-carriers, Q4X carriers had a significantly increased lung cancer risk, with ORs of 2.59 in the discovery (P = 1.15 × 10⁻¹⁸, n = 27966) and 1.69 in the replication datasets (P = 0.03, n = 10248). Stratified analysis showed that Q4X carriers had a consistently increased risk of lung cancer among females, males, smokers and non-smokers, and across histological subtypes. The error bars are OR ± the 95% CI values. P values were determined by logistic regression analysis adjusted for age, gender, smoking status and the principal components. Source data are provided as a Source Data file. b Distribution of KIAA0930 Q4X homozygotes. In the discovery data, all homozygotes for the mutant allele of rs150665432 developed lung cancer. Source data are provided as a Source Data file (Source Data 1, 2, and 3). Color shades indicate the histological subtypes.
Fig. 4 The onset of lung cancer risk and biallelic two-hit events of ATM rs56009889.
a rs56009889 affects the age of onset. The error bars are mean + the standard error of the mean (SEM). In the discovery data, the mean age of onset for lung cancer cases carrying L2307F was significantly higher than that for non-carrier cases. A later age of onset was observed for overall lung cancer (n = 15830), females (n = 5810), males (n = 10019), smokers (n = 14006), LAD (n = 6329) and females with LAD (n = 2954). In the replication data, a borderline significant difference in the age of onset was observed only in females with LAD (n = 906) and non-smokers with LAD (n = 293), although the sample sizes are small. P values were determined by two-sided t tests without adjustment. No carrier of the T allele developed LSQ or SCLC in the replication data. b Rate of loss of heterozygosity targeting either the ATM L2307F allele or synonymous variants in the ATM gene. Source data are provided as a Source Data file.
Fig. 5 ATM and KIAA0930 isoforms.
a ATM has eight isoforms that produce proteins of different lengths. ATM has four functional domains: the TAN domain, which is used for telomere-length maintenance and DNA damage repair; the FAT and FATC domains, which are important regulatory domains; and the phosphoinositide 3-kinase-related kinase (PIKK) domain, a catalytic domain with intrinsic serine/threonine kinase activity. b ATM isoform expression from GTEx data (n = 427). c A heatmap showing ATM isoform expression from GTEx data. d ATM isoform expression from the data from Germany (n = 6). e ENST00000251993 is the full-length, canonical isoform of KIAA0930. rs150665432 can truncate its protein length from 409 to 3 aa. f KIAA0930 isoform expression from GTEx data (n = 427). g A heatmap showing KIAA0930 isoform expression from GTEx data. h KIAA0930 isoform expression from the data from Germany (n = 6). Boxplots in this figure show the three quartiles (25%, median 50%, and 75%) of the data set, calculated using the percentile function, and the minimum and maximum values of the data set that are not outliers. Outliers are detected using the interquartile range method: data points are labeled as outliers if they lie more than 1.5 times the interquartile range above or below the end points of that range.
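The boxplot convention described in the caption (three quartiles, whiskers at the most extreme non-outlier values, outliers beyond 1.5 times the interquartile range) can be sketched as below. The caption does not name the software or the percentile interpolation rule, so the use of Python's `statistics.quantiles` with the "inclusive" method is an assumption, as are the function name and the toy data.

```python
from statistics import quantiles


def boxplot_summary(values, k=1.5):
    """Quartiles, whisker ends and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    # Quartiles by linear interpolation between order statistics (assumed rule).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    inliers = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inliers),    # smallest value that is not an outlier
        "whisker_high": max(inliers),   # largest value that is not an outlier
        "outliers": outliers,
    }


# Toy expression values (made up); 100 lies far above the upper fence.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```

Different percentile interpolation rules shift the quartiles slightly for small samples such as n = 6, so whisker and outlier calls may differ between tools.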