Literature DB >> 28827732

Genetic predisposition to lung cancer: comprehensive literature integration, meta-analysis, and multiple evidence assessment of candidate-gene association studies.

Junjun Wang1,2, Qingyun Liu1, Shuai Yuan1, Weijia Xie1, Yuan Liu1, Ying Xiang1,2, Na Wu1,2, Long Wu1,2, Xiangyu Ma1,2, Tongjian Cai1,2, Yao Zhang1,2, Zhifu Sun3, Yafei Li4,5.   

Abstract

More than 1000 candidate-gene association studies on genetic susceptibility to lung cancer have been published over the last two decades but with few consensuses for the likely culprits. We conducted a comprehensive review, meta-analysis and evidence strength evaluation of published candidate-gene association studies in lung cancer up to November 1, 2015. The epidemiological credibility of cumulative evidence was assessed using the Venice criteria. A total of 1018 publications with 2910 genetic variants in 754 different genes or chromosomal loci were eligible for inclusion. Main meta-analyses were performed on 246 variants in 138 different genes. Twenty-two variants from 21 genes (APEX1 rs1130409 and rs1760944, ATM rs664677, AXIN2 rs2240308, CHRNA3 rs6495309, CHRNA5 rs16969968, CLPTM1L rs402710, CXCR2 rs1126579, CYP1A1 rs4646903, CYP2E1 rs6413432, ERCC1 rs11615, ERCC2 rs13181, FGFR4 rs351855, HYKK rs931794, MIR146A rs2910164, MIR196A2 rs11614913, OGG1 rs1052133, PON1 rs662, REV3L rs462779, SOD2 rs4880, TERT rs2736098, and TP53 rs1042522) showed significant associations with lung cancer susceptibility with strong cumulative epidemiological evidence. No significant associations with lung cancer risk were found for other 150 variants in 98 genes; however, seven variants demonstrated strong cumulative evidence. Our findings provided the most updated summary of genetic risk effects on lung cancer and would help inform future research direction.

Entities:  

Mesh:

Year:  2017        PMID: 28827732      PMCID: PMC5567126          DOI: 10.1038/s41598-017-07737-0

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Lung cancer is the most common cancer and the leading cause of cancer-related mortality around the world[1]. While smoking is the leading cause of lung cancer, genetics plays an important role as less than 20% of smokers develop this deadly disease in their lifetime[2] and non-smokers with a family history of cancer have an increased risk of lung cancer[3]. Genetic variants influencing lung-cancer risk fall into three categories: rare high-risk variants (prevalence of 1% or less), moderate-risk variants (prevalence of not more than 5%), and common low-risk variants (prevalence of more than 5%). Family-based linkage studies is most appropriate for high risk variants with high penetrance but more costly to conduct as lung cancer is a common disease and multiple occurrences of lung cancer in a family are less common. To date, the most concrete linkage and fine mapping studies reveal a lung-cancer susceptibility locus at 6q23–25 and RGS17 as a possible culprit gene[4-6]. Based on the “common disease and common variant” hypothesis, genome-wide association studies (GWAS) provide a powerful tool for investigating the genetic association of a complex disease[7]. Over the past ten years, common genetic variations at 5p15.33 (TERT/CLPTM1L), 6p21.33 (BAT3/MSH5) and 15q25.1 (CHRNA5/CHRNA3/CHRNB4) are identified to modify the lung cancer susceptibility in GWAS[8-13] and GWAS-based meta-analyses[14, 15] (eg, TERT rs2736100, CHRNA3 rs8042374, APOM rs3117582, MSH5 rs3131379, and GTF2H4 rs114596632). However, these only explain less than 10% of the risk contribution to lung cancer[16]. Candidate-gene approaches were the mainstay of genetic association studies before the GWAS era. They are relatively cost-effective and easy to perform. Over 1,000 such studies on the lung cancer susceptibility have been published for the past 25 years. However, there are a number of conflicting reports and it is very challenging to find reliable associations from these highly diverse studies. As a method for systematically integrating data from multiple studies to develop a single conclusion with greater statistical power, meta-analysis is a good way to deal with the diverse and fragmented studies. Although some meta-analyses have been performed on lung cancer, most are limited to investigating a single genetic variant, several variants in a gene, or several variants across a pathway. The recent systematic meta-analyses push the limit to all available genetic association studies in a specific disease and help to achieve a comprehensive view to the genetic contributions to the disease. Alzheimer’s disease[17], breast cancer[18], and colorectal cancer[19] are a few good examples using systematic meta-analyses with consensus outcomes. Establishing robust evidence of genetic predisposition to lung cancer risk has a potential clinical utility for not only population risk stratification but also primary prevention. The main objective of our study was to identify, consolidate, and interpret genetic associations of common variants with lung cancer using a comprehensive research synopsis and systematic meta-analysis. We attempted to systematically evaluate all published candidate-gene association studies in lung cancer following credible guidelines, which were used to guide and standardize these field synopses[20-22]. Additionally, for variants with significant associations by meta-analysis, we applied Venice criteria[21] proposed by the Human Genome Epidemiology Network (HuGENet) to assess the epidemiological credibility of cumulative epidemiological evidence of these associations, so as to obtain more reliable results. Moreover, to get a better insight of the differences in genetic variations among populations with different characters, associations stratified by ethnicity, histological types, and smoking status were also examined.

Results

Among the final 1,018 eligible publications for our meta-analysis (Fig. 1), vast majority (n = 926, 91%) were published after 1999, and 684 (67%) of these papers were published over the past decade (2006~2015) (Supplementary Fig. S1 ). A total of 2,910 genetic variants from 754 unique candidate genes or loci were eligible for further analyses. The included studies had a mean of 414 cases (range 13–4257) and 565 controls (range 12–55823). Among the 2,910 variants, 254 were reported in at least three independent datasets, and eight had been reported as the top association variants with lung cancer (P < 5 × 10−8) in published GWAS[8, 9, 23, 24]. Therefore, our meta-analyses were focused on the remaining 246 genetic variants in 138 genes or loci (Supplementary Table S1 ). More detailed information of the variants was presented in the Supplementary Results.
Figure 1

Flowchart of literature search and selection for meta-analyses for candidate-gene association studies of lung cancer.

Flowchart of literature search and selection for meta-analyses for candidate-gene association studies of lung cancer.

Main meta-analyses

For the 246 variants, we first conducted 246 main meta-analyses, one for each variant. On average, these analyses had 6,315 subjects (range 397–71120) and were combined from eight studies (range 3–133) (Supplementary Table S1). The allelic model was performed for all but nine because of insufficient available data from the original studies (Supplementary Table S1). Of the 246 main meta-analyses, 56 variants within 45 different genes showed nominally significant genetic associations with lung cancer (p-value < 0.05) (Table 1, Supplementary Table S2). The strength of association between each genetic variant and lung cancer as measured by ORs had the mean of 1.36 (range 1.08–2.55) for putative “risk” variants and 0.78 (range 0.55–0.90) for putative “protective” variants. Of the 56 main meta-analyses with significant results, 24 had little or no heterogeneity, 16 had evidence of potential bias (publication bias, small study effects, or excess significance bias), and 16 were lack of robustness based on the sensitivity analyses. More details of the results were presented in the Supplementary Results.
Table 1

Genetic variants with significant associations with lung cancer risk in main meta-analyses (Continued on next page)

GenesVariants*Frequency (%) EthnicityNumber evaluatedGenetic associations with lung cancerHeterogeneityBegg P Venice criteria grades Credibility of evidence§
StudiesCases/ControlsContrast¶ OR(95%CI)p valueI2 (%)PQ ǁ
APEX1 rs1760944(A/C)47.94All83588/3783A vs C1.16(1.08–1.25)2.85 × 10–5 90.3600.386AAAStrong
AXIN2 rs2240308(T/C)37.40All3758/742T vs C0.73(0.63–0.85)6.39 × 10−5 00.3981.000AAAStrong
CHRNA3 rs6495309(T/C)38.44All43381/4244T vs C0.83(0.77–0.89)6.55 × 10−8 00.4271.000AAAStrong
CXCR2 rs1126579(T/C)55.45All3942/964T vs C0.84(0.74–0.96)0.00900.9671.000AAAStrong
CYP2E1 rs6413432(A/T)22.17All142944/3347A vs T0.78(0.71–0.85)6.76 × 10−8 00.8210.827AAAStrong
HYKK rs931794(G/A)32.89All52435/3180G vs A1.23(1.14–1.34)1.85 × 10−7 00.8641.000AAAStrong
PON1 rs662(A/G)46.70All3995/834A vs G0.77(0.67–0.88)2.02 × 10−4 00.7011.000AAAStrong
REV3L rs462779(T/C)39.36Asian 41937/2335T vs C1.11(1.02–1.22)0.02100.9110.734AACStrong
ATM rs189037(A/G)42.68Asian 53036/3415A vs G1.09(1.00–1.18)0.050290.2270.806ABCModerate
CD3EAP rs967591(A/G)32.09All3676/726A vs G1.23(1.01–1.49)0.036220.2781.000BAAModerate
CYP2A6 rs1801272(A/T)3.99Caucasian 32411/2644carriers vs non-carriers0.66(0.52–0.84)0.00100.6741.000BABModerate
HIF1A rs11549467(A/G)9.45All3509/566A vs G2.27(1.74–2.96)1.62 × 10−9 00.4810.296BAAModerate
PDCD5 rs1862214(G/C)32.06All3737/683G vs C1.32(1.12–1.56)0.00100.3950.296BABModerate
PROM1 rs2240688(C/A)27.37Asian 32332/2457C vs A0.83(0.76–0.91)6.92 × 10−5 00.9910.296AABModerate
TP53 rs12951053(G/T)9.93All3475/569G vs T1.57(1.11–2.23)0.011370.2030.296BBBModerate
TP63 rs10937405(T/C)42.62All44927/8794T vs C0.87(0.81–0.94)2.20 × 10−4 340.2070.308ABAModerate
WWOX CNV-670482.86Asian 42942/30740 copy vs 2 copies2.06(1.58–2.70)1.20 × 10−7 00.9111.000BABModerate
XRCC1 rs3213255(G/A)38.15All31089/1506G vs A1.21(1.08–1.35)0.00100.4570.296AABModerate
AGER rs1800624(A/T)34.41Asian 31656/1693A vs T1.18(1.04–1.33)0.010160.3051.000AACWeak
BCL2 rs2279115(A/C)43.37All51847/2367A vs C0.65(0.46–0.91)0.011910.0000.624ACCWeak
CHRNA3 rs578776(T/C)31.98All31245/2009T vs C0.87(0.77–0.98)0.01800.9081.000AACWeak
CHRNA3 rs938682(C/T)28.37All31240/1986C vs T0.86(0.76–0.96)0.00900.5820.296AACWeak
CHRNA3 rs12914385(T/C)35.09All45356/2873T vs C1.20(1.01–1.44)0.044760.0070.734ACAWeak
CHRNA5 rs16969968(A/G)32.51All116222/62452A vs G1.23(1.06–1.43)0.007800.0000.119ACCWeak
CLPTM1L rs402710(T/C)32.92All137214/8051T vs C0.89(0.83–0.95)2.63 × 10−4 380.0780.669ABCWeak
CYP1A1 rs4646903(C/T)21.88All579844/12410C vs T1.16(1.07–1.25)1.59 × 10−4 550.0000.772ACCWeak
CYP1A1 rs1048943(G/A)17.83All549869/12114G vs A1.23(1.11–1.36)7.64 × 10−5 670.0000.649ACCWeak
CYP1B1 rs1056836(G/C)38.50All123033/3866G vs C1.13(1.05–1.22)0.00200.5510.064AACWeak
CYP2A6 rs5031016(C/T)9.89All31527/1138C vs T0.57(0.33–1.00)0.048730.0250.296BCCWeak
CYP2E1 rs2031920(T/C)17.33All234983/6628T vs C0.86(0.76–0.97)0.018500.0030.509ACAWeak
ELANE rs351107(G/T) (−903T > G, Rep_a)5.31Caucasian 3745/762G vs T0.55(0.34–0.87)0.011290.2461.000BBCWeak
ELANE rs7254054(A/G) (−741G > A, Rep_b)27.20Caucasian 3754/750A vs G0.77(0.61–0.97)0.030460.1550.296BBCWeak
ERCC1 rs11615(C/T)51.18All125731/7058C vs T0.90(0.83–0.99)0.023520.0180.086ACCWeak
ERCC2 rs238406(A/C)40.05All61754/2688A vs C1.12(1.02–1.23)0.01300.5580.260AACWeak
ERCC2 rs13181(C/A)25.26All4013111/16749C vs A1.12(1.05–1.19)4.18 × 10−4 490.0000.753ABCWeak
ERCC5 rs1047768(T/C)43.99All41449/2248T vs C0.86(0.74–1.00)0.049480.1230.734ABCWeak
ERCC6 rs3793784(G/C)30.82All31643/1689G vs C0.75(0.60–0.92)0.007680.0441.000ACAWeak
FGFR4 rs351855(A/G)42.47All41083/1275A vs G0.82(0.69–0.98)0.025330.2140.089ABCWeak
GSTM1 Present/null48.85All13333253/37867null vs present1.18(1.12–1.23)2.54 × 10−11 520.0000.105ACCWeak
GSTP1 rs1695(G/A)30.41All4612521/14411G vs A1.08(1.02–1.15)0.011550.0000.075ACCWeak
GSTT1 GSTT126.14All7723009/25365null vs present1.10(1.02–1.19)0.011580.0000.346ACCWeak
HRAS1 VNTR(common alleles/rare alleles)7.03Caucasian 4746/1174rare vs common2.55(1.01–6.45)0.048690.0230.734BCCWeak
IL10 rs1800896(G/A)37.18All102861/3817G vs A1.29(1.05–1.59)0.017750.0000.074ACCWeak
MAPKAPK2 CNV-304509.76Asian 32332/24804 copies vs 2 copies1.60(1.04–2.45)0.031810.0051.000BCBWeak
MDM2 rs2279744(G/T)41.05All1911076/14434G vs T1.10(1.01–1.19)0.021750.0000.700ACCWeak
MIR146A rs2910164(C/G)45.26All63158/3225C vs G1.16(1.06–1.27)0.001210.2740.260AACWeak
MMP2 rs243865(T/C)16.77All31751/1729T vs C0.63(0.45–0.89)0.009800.0070.296BCCWeak
MTRR rs1801394(G/A)43.28All31668/2291G vs A1.13(1.03–1.24)0.01100.5251.000AACWeak
NOD2 rs2066847 (3020insC/-)0.50All3807/4078carriers vs non-carriers1.42(1.07–1.90)0.01700.5931.000 × ACWeak
SFTPB wild type/ variation5.83All3157/240variation vs wild1.92(1.11–3.33)0.02000.9600.296CABWeak
SOD2 rs4880(T/C)51.48All93738/4467T vs C1.20(1.06–1.36)0.005610.0090.348ACAWeak
TERT rs2736098(A/G)33.01All74660/4825A vs G1.20(1.08–1.33)0.001670.0060.548ACBWeak
UGT1A6 rs6759892(G/T)25.10All3266/261G vs T2.27(1.14–4.53)0.020840.0021.000BCAWeak
XRCC1 rs1001581(T/C)34.52All5851/1166T vs C1.17(1.00–1.37)0.044280.2320.221ABCWeak
XRCC1 rs1799782(T/C)18.19All3011096/13772T vs C0.90(0.82–0.98)0.022620.0000.372ACCWeak
XRCC1 rs3213245(C/T)11.03All52795/2865C vs T1.29(1.04–1.59)0.020680.0140.806ACCWeak

OR = odds ratio; 95% CI = 95% confidence interval. VNTR = variable number of tandem repeats. CNV = copy number variation. ins = insertion. *Minor alleles/major alleles (per Caucasian); majors alleles were treated as reference alleles in the analyses. †Frequency of minor allele or effect genotype (s) in controls in main meta-analyses. ¶Allelic contrast or phenotype trait for common variants; genetic comparison for rare variants or variants only with genotype group data. ǁP value of the test for between-study heterogeneity. ∫Venice criteria grades are for amount of evidence, replication of the association, and protection from bias; one rare variant was not scored for amount of evidence (×). §Credibility of evidence is categorized as “strong”, “moderate”, or “weak” for association with lung cancer risk. ‡Only Asian or Caucasian data were available for meta-analysis.

Genetic variants with significant associations with lung cancer risk in main meta-analyses (Continued on next page) OR = odds ratio; 95% CI = 95% confidence interval. VNTR = variable number of tandem repeats. CNV = copy number variation. ins = insertion. *Minor alleles/major alleles (per Caucasian); majors alleles were treated as reference alleles in the analyses. †Frequency of minor allele or effect genotype (s) in controls in main meta-analyses. ¶Allelic contrast or phenotype trait for common variants; genetic comparison for rare variants or variants only with genotype group data. ǁP value of the test for between-study heterogeneity. ∫Venice criteria grades are for amount of evidence, replication of the association, and protection from bias; one rare variant was not scored for amount of evidence (×). §Credibility of evidence is categorized as “strong”, “moderate”, or “weak” for association with lung cancer risk. ‡Only Asian or Caucasian data were available for meta-analysis. The credibility assessment of the cumulative epidemiological evidence found eight genetic variants (APEX1 rs1760944, AXIN2 rs2240308, CHRNA3 rs6495309, CXCR2 rs1126579, CYP2E1 rs6413432, HYKK rs931794, PON1 rs662, and REV3L rs462779) were strong and ten were moderate (ATM rs189037, CD3EAP rs967591, CYP2A6 rs1801272, HIF1A rs11549467, PDCD5 rs1862214, PROM1 rs2240688, TP53 rs12951053, TP63 rs10937405, WWOX CNV-67048, and XRCC1 rs3213255) (Table 1, Supplementary Table S2). In the dominant genetic model analyses (Supplementary Table S1), 44 variants showed significant associations with lung cancer risk, of which seven had non-significant association in the main allelic meta-analyses yet, interestingly, two (ATM rs66467 and REV3L rs465646) showed strong and moderate cumulative epidemiological evidence, respectively (Table 2, Supplementary Table S2). Under the recessive model, 39 variants showed statistically significant associations, of which ten were non-significant under an allelic model. However, none of these showed strong cumulative epidemiologic evidence, although five variants (CASC8 rs6983267, CHRNA5 rs142774214, CYP2A6 non*4/*4, IL17A rs2275913, and XPA rs1800975) showed moderate evidence (Table 2).
Table 2

Genetic variants with significant associations with lung cancer risk under a dominant or recessive genetic model.

GenesVariantsAlleles*MAF (%)Number evaluatedGenetic associations with lung cancerHeterogeneityBegg P Venice criteria grades Credibility of evidence
StudiesCases/ControlsGenetic modelsOR(95%CI)p valueI2 (%)PQ ǁ
ATM rs664677C/T58.9031627/1641Dominant0.76(0.64–0.92)0.00400.4481.000AAAStrong
REV3L rs465646C/T18.1831296/1511Dominant0.78(0.67–0.92)0.00300.4371.000BABModerate
CASC8 rs6983267G/T44.7731539/1989Recessive1.22(1.04–1.44)0.01300.6440.296BAAModerate
CHRNA5 rs142774214ins/-37.6731431/1606Recessive0.80(0.65–0.98)0.03200.5971.000BAAModerate
CYP2A6 non*4/*4del/-13.4872623/2380Recessive0.51(0.35–0.73)2.93 × 10−4 00.5391.000BAAModerate
IL17A rs2275913A/G24.903889/998Recessive1.76(1.21–2.55)0.003180.2950.296BABModerate
XPA rs1800975A/G36.74124221/5240Recessive1.22(1.05–1.42)0.011330.1240.681ABAModerate
Chr8q24 rs16901979A/C19.4831534/1992Dominant1.18(1.02–1.37)0.02500.6101.000AACWeak
CYP1B1 rs10012G/C25.983622/666Dominant1.69(1.05–2.72)0.031740.0211.000BCCWeak
EGF rs4444903G/A59.283666/690Dominant2.07(1.01–4.24)0.048790.0090.296ACCWeak
MLH1 rs1800734A/G48.8652178/2320Dominant0.80(0.68–0.95)0.009240.2600.462AACWeak
PTGS2 rs689466G/A38.0741676/2180Dominant0.78(0.62–0.97)0.026560.0760.734ACAWeak
FASLG rs763110T/C34.0154436/4120Recessive0.83(0.70–0.99)0.038300.2210.462ABCWeak
IL1B rs1143627C/T38.8184201/5431Recessive0.80(0.68–0.95)0.010490.0590.019ABCWeak
LIG1 rs156641A/G31.7131112/2048Recessive1.45(1.14–1.83)0.00200.3701.000BACWeak
XRCC1 rs25487A/G29.704816999/20567Recessive1.16(1.03–1.30)0.018540.0000.729ACCWeak
XRCC3 rs1799794G/A41.0941389/1941Recessive0.82(0.67–0.99)0.03800.4691.000BACWeak

MAF = minor allele frequency in controls. OR = odds ratio; 95% CI = 95% confidence interval. chr = chromosome. ins = insertion. del = deletion. bp = base pair. *Minor alleles/major alleles (per Caucasian); major alleles were treated as reference alleles in the analyses; Dominant model, summary OR was estimated for subjects who carry one or two minor alleles. Recessive model, summary OR was estimated for subjects have homozygous of the minor alleles. ǁP value of the test for between-study heterogeneity. †Venice criteria grades are for amount of evidence, replication of the association, and protection from bias; one rare variant was not scored for amount of evidence (×). ‡Credibility of evidence is categorized as “strong”, “moderate”, or “weak” for association with lung cancer risk.

Genetic variants with significant associations with lung cancer risk under a dominant or recessive genetic model. MAF = minor allele frequency in controls. OR = odds ratio; 95% CI = 95% confidence interval. chr = chromosome. ins = insertion. del = deletion. bp = base pair. *Minor alleles/major alleles (per Caucasian); major alleles were treated as reference alleles in the analyses; Dominant model, summary OR was estimated for subjects who carry one or two minor alleles. Recessive model, summary OR was estimated for subjects have homozygous of the minor alleles. ǁP value of the test for between-study heterogeneity. †Venice criteria grades are for amount of evidence, replication of the association, and protection from bias; one rare variant was not scored for amount of evidence (×). ‡Credibility of evidence is categorized as “strong”, “moderate”, or “weak” for association with lung cancer risk.

Subgroup meta-analyses

Ethnicity

Subgroup meta-analyses were conducted in Caucasian and Asian population separately under each of the three genetic models (allelic, dominant, or recessive model) depending on the available data (Supplementary Table S3 ). We found that 19 and 26 variants were significantly associated with lung cancer susceptibility in Caucasian and Asian population, respectively. Five variants (APEX1 rs1130409, CHRNA5 rs16969968, CLPTM1L rs402710, ERCC2 rs13181, and SOD2 rs4880) showed strong and five (CYP1A2 rs762551, CYP1B1 rs1056836, CYP2A6 rs1801272, CYP2E1 rs2031920, and XRCC1 rs1799782) showed moderate evidence in the Caucasian population (Table 3, Supplementary Table S4). For the significant variants in the Asian population, strong and moderate cumulative evidence were observed in seven (APEX1 rs1760944, CLPTM1L rs402710, CYP2E1 rs6413432, MIR146A rs2910164, MIR196A2 rs11614913, REV3L rs462779, and TERT rs2736098) and seven variants (ATM rs189037, CHRNA3 rs6495309, CYP2A6 non*4/*4, GSTT1 present/null, PROM1 rs2240688, REV3L rs465646, and WWOX CNV-67048), respectively (Table 3, Supplementary Table S4). Comparing the significant variants across ethnic groups, we found that 13 variants (AGER rs1800624, ATM rs189037, CYP2A6 non*4/*4, FASLG rs763110, IL10 rs1800872, MAPKAPK2 CNV-30450, MIR196A2 rs11614913, PROM1 rs2240688, REV3L rs462779, REV3L rs465646, VEGFA rs833061, WWOX CNV-67048, and XRCC1 rs25487) were unique to the Asian population, and seven (APEX1 rs1130409, CYP1A2 rs762551, CYP2A6 rs1801272, ELANE rs351107, ELANE rs7254054, HRAS1 a VNTR variation, and MTHFR rs1801131) to Caucasian population. Four variants (CLPTM1L rs402710, CYP1A1 rs4646903, CYP1A1 rs1048943, and GSTM1 present/null) shared between the two groups, including one (CLPTM1L rs402710) showed consistent strong evidence of significant associations in both groups (Supplementary Fig. S2).
Table 3

Genetic variants with significant associations with lung cancer risk in subgroup meta-analyses with strong or moderate cumulative evidence (Continued on next page).

GeneSubgroupVariants*Number evaluatedLung-cancer risk meta-analysisHeterogeneityBegg P Venice criteria grades Credibility of evidence§
StudiesCases/ControlsGenetic modelsOR(95%CI)p valueI2 (%)PQ ǁ
APEX1 Caucasianrs1130409(G/T)71807/3065Recessive0.84(0.72–0.97)0.02100.6950.764AAAStrong
CHRNA5 Caucasianrs16969968(A/G)63305/59780Allelic1.35(1.27–1.44)2.03 × 10−21 00.9580.990AAAStrong
CLPTM1L Caucasianrs402710(T/C)41801/1908Allelic0.86(0.78–0.94)0.00200.5320.734AAAStrong
ERCC2 Caucasianrs13181(C/A)185967/8851Recessive1.15(1.04–1.29)0.009160.2580.495AAAStrong
SOD2 Caucasianrs4880(T/C)43185/3966Allelic1.17(1.10–1.25)2.24 × 10−6 00.9730.406AAAStrong
CYP1A2 Caucasianrs762551(C/A)3869/1468Recessive1.69(1.20–2.36)0.002300.2321.000BBAModerate
CYP1B1 Caucasianrs1056836(G/C)61849/2655Dominant1.18(1.04–1.34)0.01000.8560.711AABModerate
CYP2A6 Caucasianrs1801272(A/T)32411/2644Dominant0.66(0.52–0.84)0.00100.6741.000BABModerate
CYP2E1 Caucasianrs2031920(T/C)6665/1224Allelic0.61(0.42–0.90)0.01300.4560.837BABModerate
XRCC1 Caucasianrs1799782(T/C)124740/6868Allelic0.84(0.72–0.98)0.028280.1720.790ABAModerate
APEX1 Asianrs1760944(A/C)53071/3038Allelic1.20(1.12–1.29)9.14 × 10−7 00.7170.462AAAStrong
CLPTM1L Asianrs402710(T/C)85413/6143Dominant0.84(0.77–0.92)1.53 × 10−4 170.2960.711AAAStrong
CYP2E1 Asianrs6413432(A/T)61964/2085Allelic0.78(0.70–0.86)1.31 × 10−6 00.8240.707AAAStrong
MIR146A Asianrs2910164(C/G)42807/2841Recessive1.23(1.09–1.39)0.00100.5941.000AAAStrong
MIR196A2 Asianrs11614913(C/T)42376/2413Dominant1.22(1.07–1.38)0.00200.4440.308AAAStrong
REV3L Asianrs462779(T/C)41937/2335Allelic1.11(1.02–1.22)0.02100.9110.734AACStrong
TERT Asianrs2736098(A/G)53829/3992Dominant1.26(1.14–1.39)1.03 × 10−5 00.8961.000AAAStrong
ATM Asianrs189037(A/G)53036/3415Allelic1.09(1.00–1.18)0.050290.2270.806ABCModerate
CHRNA3 Asianrs6495309(T/C)32635/2767Allelic0.83(0.76–0.91)6.17 × 10−5 270.2541.000ABAModerate
CYP2A6 Asian*4/non*462517/2264Recessive0.52(0.36–0.75)0.00100.4540.707BAAModerate
GSTT1 Asiannull/present147043/5289Allelic1.15(1.03–1.28)0.010340.1050.827ABAModerate
PROM1 Asianrs2240688(C/A)32332/2457Allelic0.83(0.76–0.91)6.92 × 10−5 00.9910.296AABModerate
REV3L Asianrs465646(C/T)31296/1511Allelic0.83(0.71–0.97)0.016140.3111.000BABModerate
WWOX AsianCNV-6704842942/30740 copy vs 2 copies2.06(1.58–2.70)1.20 × 10−7 00.9111.000BABModerate
CYP1A1 SCLCrs4646903(C/T)12273/2545Recessive1.71(1.08–2.71)0.02100.9040.244BAAModerate
GSTM1 SCLCnull/present261224/7255Allelic1.30(1.09–1.56)0.004430.0101.000ABAModerate
CHRNA5 NSCLCrs16969968(A/G)63201/4736Allelic1.36(1.24–1.48)1.48 × 10−11 130.3290.707AAAStrong
CLPTM1L NSCLCrs402710(T/C)62940/4040Allelic0.85(0.79–0.91)1.13 × 10−5 00.6661.000AAAStrong
CYP2E1 NSCLCrs6413432(A/T)61290/1809Allelic0.80(0.71–0.91)4.90 × 10−4 00.8681.000AAAStrong
ERCC1 NSCLCrs11615(C/T)3780/811Allelic0.68(0.58–0.81)1.01 × 10−5 130.3160.296AAAStrong
FGFR4 NSCLCrs351855(A/G)3985/1230Allelic0.76(0.68–0.86)1.97 × 10−5 00.5901.000AAAStrong
HYKK NSCLCrs931794(G/A)41548/2464Allelic1.25(1.13–1.37)9.08 × 10−6 00.8800.734AAAStrong
MIR146A NSCLCrs2910164(C/G)4880/1094Allelic1.28(1.11–1.46)4.63 × 10−4 00.3910.734AAAStrong
TERT NSCLCrs2736098(A/G)42002/2490Allelic1.30(1.19–1.42)2.59 × 10−9 00.8180.734AAAStrong
IL17A NSCLCrs2275913(A/G)3780/998Recessive1.72(1.12–2.65)0.013310.2350.296BBBModerate
TP63 NSCLCrs10937405(T/C)33587/8484Allelic0.87(0.82–0.92)9.91 × 10−7 00.5951.000AABModerate
XPC NSCLCPAT-/ + (ins/non-ins)3967/1340Recessive1.46(1.17–1.81)0.00100.4831.000BAAModerate
XRCC1 NSCLCrs3213245(C/T)31744/2178Dominant1.50(1.29–1.75)1.89 × 10−7 00.6830.296BAAModerate
CYP2E1 ADrs6413432(A/T)6500/1809Allelic0.79(0.66–0.95)0.01100.6640.707AAAStrong
OGG1 ADrs1052133(G/C)123603/6677Recessive1.25(1.10–1.43)0.001200.2460.945AAAStrong
TERT ADrs2736098(A/G)41214/2490Allelic1.40(1.26–1.54)4.97 × 10−11 00.8910.308AAAStrong
TP53 ADrs1042522(C/G)223504/8822Recessive1.20(1.05–1.38)0.008160.2450.143AAAStrong
CHRNA5 ADrs16969968(A/G)41507/2834Allelic1.37(1.14–1.64)0.001330.2140.734ABAModerate
ERCC2 ADrs13181(C/A)4664/1230Dominant1.35(1.06–1.70)0.01300.6350.734BAAModerate
IL17A ADrs2275913(A/G)3469/998Recessive1.84(1.11–3.06)0.018360.2111.000BBBModerate
MDM2 ADrs2279744(G/T)61714/4083Recessive1.28(1.04–1.56)0.018460.0980.707ABAModerate
TP63 ADrs10937405(T/C)31158/8484Allelic0.82(0.75–0.90)2.91 × 10−5 00.8980.296AABModerate
XRCC1 ADrs3213245(C/T)3860/2178Dominant1.55(1.29–1.87)4.72 × 10−6 00.7580.296BAAModerate
CYP1A1 SCCrs4646903(C/T)171021/3959Allelic1.45(1.26–1.67)3.77 × 10−7 210.2150.232AAAStrong
CYP2E1 SCCrs6413432(A/T)6715/1809Allelic0.76(0.65–0.88)3.98 × 10−4 00.9110.260AAAStrong
APEX1 smokersrs1760944(A/C)3655/647Allelic1.37(1.11–1.69)0.003430.1741.000ABAModerate
CYP1A1 smokersrs4646903(C/T)71034/1087Allelic1.30(1.02–1.64)0.033460.0880.230BBAModerate
CYP2A6 smokers*4/non*431339/848Allelic0.71(0.59–0.85)2.30 × 10−4 130.3191.000BAAModerate
CYP2E1 smokersrs6413432(A/T)3796/791Allelic0.75(0.63–0.90)0.00220.3600.296BAAModerate
CYP2E1 smokersrs2031920(T/C)31064/1220Allelic0.76(0.65–0.90)0.00100.7270.296BAAModerate
GSTP1 smokersrs1138272(T/C)3924/1026Dominant1.63(1.28–2.08)9.17 × 10−5 00.4591.000BAAModerate
NBN smokersrs1805794(G/C)31226/1220Recessive0.83(0.71–0.98)0.03000.5540.296BAAModerate
ERCC1 non-smokersrs11615(C/T)3731/958Allelic0.85(0.72–0.99)0.04200.4491.000AAAStrong
CYP2E1 non-smokersrs6413432(A/T)5315/560Dominant0.72(0.54–0.97)0.02800.9590.806BAAModerate
CYP2E1 non-smokersrs2031920(T/C)3304/695Allelic0.70(0.54–0.90)0.00500.8631.000BAAModerate
ERCC2 non-smokersrs13181(C/A)3478/469Dominant1.88(1.36–2.58)1.11 × 10−4 00.5500.296BAAModerate
GSTM1 non-smokersnull/present321924/4718Allelic1.37(1.16–1.61)1.60 × 10−4 410.0090.212ABAModerate
TP53 non-smokersrs1042522(C/G)111882/2887Recessive1.28(1.01–1.61)0.040390.0880.586ABAModerate
XRCC1 non-smokersrs3213245(C/T)3977/1310Dominant1.43(1.17–1.75)4.56 × 10−4 00.5300.296BAAModerate

OR = odds ratio; 95%CI = 95% confidence interval. ins = insertion. del = deletion. CNV = copy number variation. SCLC = small cell lung cancer. NSCLC = non-small cell lung cancer. AD = adenocarcinoma. SCC = squamous cell carcinoma. *Minor alleles/major alleles (per Caucasian); major alleles were treated as reference alleles in the analyses. ǁP value of the test for between-study heterogeneity. ∫Venice criteria grades are for amount of evidence, replication of the association, and protection from bias. §Credibility of evidence is categorized as “strong”, “moderate”, or “weak” for association with lung cancer risk; one association with strong evidence for a variant was not considered the bias of low OR for the presence of highly consistent results across studies enrolled in meta-analysis.

Genetic variants with significant associations with lung cancer risk in subgroup meta-analyses with strong or moderate cumulative evidence (Continued on next page). OR = odds ratio; 95%CI = 95% confidence interval. ins = insertion. del = deletion. CNV = copy number variation. SCLC = small cell lung cancer. NSCLC = non-small cell lung cancer. AD = adenocarcinoma. SCC = squamous cell carcinoma. *Minor alleles/major alleles (per Caucasian); major alleles were treated as reference alleles in the analyses. ǁP value of the test for between-study heterogeneity. ∫Venice criteria grades are for amount of evidence, replication of the association, and protection from bias. §Credibility of evidence is categorized as “strong”, “moderate”, or “weak” for association with lung cancer risk; one association with strong evidence for a variant was not considered the bias of low OR for the presence of highly consistent results across studies enrolled in meta-analysis.

Histological types of lung cancer

Considering the etiologic differences of different subtypes of lung cancer, subgroup meta-analyses were performed for genetic variants with data available for non-small cell lung cancer [NSCLC], small cell lung cancer [SCLC], adenocarcinoma [AD], and squamous cell carcinoma [SCC] under each of the three genetic models (allelic, dominant, or recessive model) (Supplementary Table S5). In the NSCLC subgroup, statistical significant associations were found for 25 variants where eight variants (CHRNA5 rs16969968, CLPTM1L rs402710, CYP2E1 rs6413432, ERCC1 rs11615, FGFR4 rs351855, HYKK rs931794, MIR146A rs2910164, and TERT rs2736098) demonstrated strong cumulative epidemiological evidence (Table 3, Supplementary Table S6). In the SCLC group, five variants showed significant associations but all were moderate or weak cumulative evidence. Three significant variants (CHRNA5 rs16969968, CYP1A1 rs4646903, and GSTM1 present/null) shared between the NSCLC and SCLC group (Supplementary Fig. S3). For the AD group, 15 variants showed significant associations where four of them have strong evidence (CYP2E1 rs6413432, OGG1 rs1052133, TERT rs2736098, and TP53 rs1042522). As for SCC, two out of eight significant variants (CYP1A1 rs4646903 and CYP2E1 rs6413432) showed strong cumulative evidence. Four significant variants (CYP2E1 rs6413432, GSTM1 present/null, SOD2 rs4880, and TERT rs2736098) were shared between the AD and SCC group, including one (CYP2E1 rs6413432) showed consistent strong evidence of significant associations in both groups (Supplementary Fig. S4).

Smoking status

As for subgroup meta-analyses by smoking status, significant associations were found for twenty-two variants and ten variants in the smokers and the non-smokers, respectively. In the smoker population, the significant associations only showed moderate (APEX1 rs1760944, CYP1A1 rs4646903, CYP2A6 non*4/*4, CYP2E1 rs6413432, CYP2E1 rs2031920, GSTP1 rs1138272, and NBN rs1805794) or weak cumulative evidence, mostly due to lack of large-scale evidence and the presence of potential biases (Table 3, Supplementary Table S8). In the non-smokers populations, the significant associations had strong, moderate, or weak evidence for one (ERCC1 rs11615), six (CYP2E1 rs6413432, CYP2E1 rs2031920, ERCC2 rs13181, GSTM1 present/null, TP53 rs1042522, and XRCC1 rs3213245), and three variants, respectively. Comparing the significant variants between two groups, seventeen were unique to the smoking population, five to the non-smoking population, and five shared between the two populations (Supplementary Fig. S5).

Functional annotations

Based on main and subgroup meta-analyses, a total of 22 variants showed significant associations to lung cancer susceptibility with strong cumulative evidence. We further performed genomic annotations for these variants using HaploReg v4.1[25], which can help to predict the functional variants. Of them, twelve variants are located in exon, two in microRNA (miRNA), and the others in non-coding regions (four intronic, two intergenic, one 5′UTR, and one 3′UTR) (Table 4). Most of these variants are located within enhancer or promoter elements that are active across a wide range of tissue types (including lung cancer or normal lung tissues). Furthermore, majority of these 22 variants have been identified as expression quantitative trait loci (eQTLs) of a number of genes in various tissue types including normal lung tissues. The functional potential of ten non-synonymous SNPs were further predicted using PolyPhen-2[26]. The variant rs351855 may result in a probably damaging effect on FGFR4 function. The other non-synonymous SNPs were predicted to be “benign”.
Table 4

Functional annotation of 22 variants associated with lung cancer risk with strong evidence using HaploReg v4.1 and PolyPhen-2.

variantGene (or near gene)ǁ HaploReg v4.1 PolyPhen-2§
GERP conservedPromoter histone marksEnhancer histone marksDNAseProteins boundMotifs changedNHGRI/EBI GWAS hitsGRASP QTL hitsSelected eQTL hitsRefSeq genesdbSNP functional annotationpredicted consequence on protein functionPolyPhenscore
rs1760944 APEX1 24 tissues*14 tissues*52 tissues*11 bound proteins2 hits69 hits* OSGEP 5′UTR
rs6495309 CHRNA3 THYM4 tissuesTHYM7 altered2 hits10 hits1.4 kb 3′ of CHRNB4
rs1126579 CXCR2 BLDBLD9 altered69 hits* CXCR2 3′UTR
rs6413432 CYP2E1 4 tissuesIPSC8 altered1 hit CYP2E1 intronic
rs931794 HYKK ESDR, SKIN, BRN4 altered1 hit26 hits AGPHD1 intronic
rs664677 ATM BLD, FAT, LIV4 altered24 hits ATM intronic
rs402710 CLPTM1L 4 tissues7 tissues5 altered1 hit 1 hit1 hit CLPTM1L intronic
rs4646903 CYP1A1 SKINLNG8 hits241 bp 3′ of CYP1A1
rs2240308 AXIN2 22 tissues*23 tissues*6 tissuesSmad32 hits3 hits AXIN2 missensebenign0
rs662 PON1 conservedLNG*10 tissues*2 hits2 hits PON1 missensebenign0
rs462779 REV3L conservedBRCA1, Nkx31 hit2 hits REV3L missensebenign0
rs1130409 APEX1 20 tissues*23 tissues*4 tissuesZNF2638 hits APEX1 missensebenign0
rs16969968 CHRNA5 32 hits* CHRNA5 missensebenign0.045
rs13181 ERCC2 conservedESDR, SKIN, SPLN4 tissues4 tissues1 hit 3 hits18 hits* ERCC2 missensebenign0
rs4880 SOD2 24 tissues*19 tissues*46 tissues*CMYC,POL2, SIN3AK20CHD21 hit29 hits* SOD2 missensebenign0
rs351855 FGFR4 conserved4 tissues15 tissues*LIV5 altered2 hits15 hits FGFR4 missenseprobably damaging0.998
rs1052133 OGG1 conservedBLD, SKIN10 tissues*GATA5 hits* OGG1 missensebenign0.121
rs1042522 TP53 5 tissues9 tissues*LNG*9 altered1 hit1 hit TP53 missensebenign0.083
rs2736098 TERT 10 tissues*16 tissues*BLD9 altered1 hit1 hit* TERT synonymous
rs11615 ERCC1 conserved9 tissues21 tissues*4 tissuesZNF263EBF,Mtf12 hits5 hits ERCC1 synonymous
rs2910164 MIR146A conserved4 tissues8 tissues MIR146A
rs11614913 MIR196A2 conserved13 tissues16 tissues*8 tissues*HMG-IY1 hit6 hits MIR196A2

ǁThe gene name for the SNP, locating in a respective gene, was based on the annotation of dbSNP database (https://www.ncbi.nlm.nih.gov/snp/). The near gene name for a SNP that didn’t map into a gene region but its location nearby a gene based on the annotation of dbSNP database, and we also used this nearby gene name for the SNP in our study. ∫HaploReg v4.1: a Web server for annotation of transcription regulation for genetic variants (http://archive.broadinstitute.org/mammals/haploreg/haploreg.php). §PolyPhen-2: a Web server for annotation of potential effects on protein structure and function for non-synonymous SNPs (http://genetics.bwh.harvard.edu/pph2/). ¶The PolyPhen-2 reported a score that the calculated naive Bayes posterior probability of a given mutation being damaging ranging from 0 to 1, which was also classified as benign [0, 0.15], possibly damaging (0.15, 0.85], and probably damaging (0.85, 1], respectively. *Including regulatory evidence in lung cancer cell lines/tissues or normal lung cell lines/tissues. †GWAS for the trait of lung cancer with a P-value at 4.0 × 10−6. ‡GWAS for the trait of lung cancer with a P-value at 9.0 × 10−7.

Functional annotation of 22 variants associated with lung cancer risk with strong evidence using HaploReg v4.1 and PolyPhen-2. ǁThe gene name for the SNP, locating in a respective gene, was based on the annotation of dbSNP database (https://www.ncbi.nlm.nih.gov/snp/). The near gene name for a SNP that didn’t map into a gene region but its location nearby a gene based on the annotation of dbSNP database, and we also used this nearby gene name for the SNP in our study. ∫HaploReg v4.1: a Web server for annotation of transcription regulation for genetic variants (http://archive.broadinstitute.org/mammals/haploreg/haploreg.php). §PolyPhen-2: a Web server for annotation of potential effects on protein structure and function for non-synonymous SNPs (http://genetics.bwh.harvard.edu/pph2/). ¶The PolyPhen-2 reported a score that the calculated naive Bayes posterior probability of a given mutation being damaging ranging from 0 to 1, which was also classified as benign [0, 0.15], possibly damaging (0.15, 0.85], and probably damaging (0.85, 1], respectively. *Including regulatory evidence in lung cancer cell lines/tissues or normal lung cell lines/tissues. †GWAS for the trait of lung cancer with a P-value at 4.0 × 10−6. ‡GWAS for the trait of lung cancer with a P-value at 9.0 × 10−7.

Non-significant associations

Non-significant associations for 150 variants within 98 genes were found under any genetic model (allelic, dominant, or recessive model) in both main and subgroup meta-analyses (Supplementary Table S9). Among these 150 variants, credibility of cumulative epidemiological evidence were identified as strong, moderate, or weak for seven (ERCC1 rs16979802, ERCC1 rs2298881, ERCC1 rs735482, POLI rs3730668, PPARG rs1801282, PTGS2 rs20417, and TNF rs1799724), four (ERCC2 rs1799793, TYMS 28-bp tandem repeat, XPC rs2228000, and XRCC3 rs861539), and 139 variants, respectively (Supplementary Table S9).

Discussion

To the best of our knowledge, this systematic meta-analysis is the largest and most comprehensive assessment of currently available literatures on candidate-gene association studies in lung cancer. This study examined associations between genetic variants and lung cancer risk using data from 1,018 candidate-gene association studies including 2,910 genetic variants. The meta-analyses and evidence evaluations allowed us to identify 22 genetic variants in 21 genes with strong evidence of associations with lung cancer risk. For these variants, additional genomic annotation information provided evidence of putative regulatory functions, including regulatory histone modification marks, DNase I hypersensitivity, motif changed, and transcription factor binding in multiple cell types including lung tissue. Variants in non-coding region associated with lung cancer risk may have their effects through transcription, mRNA stability, protein structure/function, or binding sites of miRNA[27]. For example, the variant rs1760944 (−656T > G) at the 5′-promoter region of APEX1 [28] was shown as a significant variant (T vs. C allele, OR 1.16, 95%CI 1.08–1.25) with strong cumulative evidence. This variant is predicted to influence promoter histone marks in 24 tissues including lung and lung cancer cell lines. Previous in vitro promoter assay has detected that the rs1760944 T allele significantly lowered promoter activity than that of the G allele, which indicated the variant allele (T) may be associated with a low transcriptional activity of the APEX1 in lung cancer cells[28]. The variant rs6495309 in CHRNA3/B4 intergenic region[12] showed strong evidence of association with lung cancer susceptibility in our meta-analysis. This finding was consistent with the results from a previous meta-analysis performed in Chinese population[29], and a recent meta-analysis performed on the basis of GWASs of lung cancer[15]. Additional subgroup analysis of Asians in our study also showed the risk effect for the rs6495309 C allele. This SNP overlaps with promoter histone marks and alters regulatory motif. Functional study also demonstrated that the rs6495309 C allele significantly increased the CHRNA3 expression through altering the ability of CHRNA3 promoter binding to the transcriptional factor Oct-1[12]. A common genetic variation rs1126579 (C > T) located in the 3′UTR of the CXCR2 (IL8RB) was found to be associated with a reduced risk of lung cancer with strong evidence. The HaploReg tool identified that rs1126579 was an eQTL for a number of genes including CXCR2. Previous studies also reported that CXCR2 was down regulated in lung cancer tissue and might play a suppressive role in lung cancer via the p53-dependent senescence[30, 31]. Functional data indicated that the rs1126579 variant can disrupt the binding site of miR-516a-3p and further increase the expression of CXCR2[30], which may also explain why rs1126579 showed a protective effect on the risk of lung cancer. Variants falling within coding regions, especially non-synonymous SNPs, could have some effects on protein structure, function, or expression level, which may explain its association with the susceptibility of disease[32]. For example, the non-synonymous CHRNA5 rs16969968 (Asp398Asn) causes an amino acid substitution at codon 398 of the CHRNA5 protein. And the aspartic acid (Asp398) is located at the central part of the second intracellular loop in the structure of CHRNA5 protein, and was reported highly conserved across multiple species[10]. The rs1042522 (Arg72Pro) is a common functional SNP in the exon 4 of TP53, which encodes an important tumor suppressor protein. TP53 gene is often mutated in NSCLC tumors, an early event in development of lung cancer[33]. Further functional data showed that the 72Pro allele carriers of lung cancer patients may have a low frequency of the TP53 mutations in tumors[34]. The rs351855 (Gly388Arg) influences the transmembrane domain of the FGFR4 protein[35]. This SNP resides in a conserved region and causes a possibly damaging effect on protein function of FGFR4 predicted by PolyPhen. Also, rs4800 (Ala16Val) is a non-synonymous SNP in SOD2. This SNP with valine variation can reduce enzyme activity[36] and further increase oxidative stress. Rs2736098 is a synonymous SNP (Asn305Asn) in exon 2 of the TERT gene, which is a well known oncogene and encodes the catalytic subunit of the telomerase[37]. This SNP may have association with telomere length[38]. Although it does not change protein amino acid, this SNP is located within the gene regulatory elements and may alter transcription factor binding. In addition, we found two SNPs with strong evidence of associations with lung cancer risk are located in miRNA gene coding regions, rs2910164 (C > G) in the seed of miR-146a-3p encoded by MIR146A and rs11614913 (C > T) in the mature sequence of miR-196a-3p encoded by MIR196A2 [39]. Both SNPs showed significant miRNA expression differences between their alleles[39, 40] and could affect the stability of secondary hairpin structure[39]. Study also showed that rs2910164 can influence the interaction between miR-146a-3p and its potential target genes, and rs11614913 can increase the affinity of miR-196a-3p for TP53 [39]. Our subgroup analyses also provided additional important details of genetic associations in specific groups. The results of subgroup meta-analyses by ethnicity supported the well-known cognition of “racial” differences in genetic effects for complex diseases including lung cancer[41] and indicated that some variants (eg, APEX1 rs1130409, CHRNA5 rs16969968, ERCC2 rs13181, SOD2 rs4880, and CYP2E1 rs6413432) with strong evidence may be ethnic-specifically associated with lung cancer risk. Previous studies had demonstrated the existence of different genetic background in different histological subtypes of lung cancer[15, 42]. When cases were stratified according to histological types, the associations between several variants (eg, CYP2E1 rs6413432, OGG1 rs1052133, TP53 rs1042522, and CYP1A1 rs4646903) and specific subtypes of lung cancer were of strong evidence. A growing number of studies demonstrates interactions between genetic variants and smoking[43, 44]. Our subgroup analysis also found that some variants showed significant associations with lung cancer risk in smokers but not in non-smokers, for example CYP1A1 rs4646903 and GSTP1 rs1695. As the purpose of meta-analysis is not only to reveal genetic variants significantly associated with lung cancer risk, but also to identify the variants with non-significant associations. Our study revealed that 150 variants in 98 genes had non-significant associations with lung cancer risk. However, most of these variants had weak cumulative epidemiological evidence due to the presence of insufficient statistical power (119/150) and/or strong between-study heterogeneity (73/150), and only 11(7.3%) variants had strong or moderate cumulative evidence. Our results provided important clues to further assess the main effects of these variants. Despite a comprehensive and systematic approach was applied to the synopsis of genetic association studies in lung cancer, several limitations should be considered when interpreting our results. First, although available studies were searched widely and eligible studies were selected strictly according to the inclusion and exclusion criteria, it is possible that some studies might have been overlooked. Our studies didn’t include research published in the form of abstracts or in language other than English. However, for most abstracts, we also searched and included relevant studies published with whole text and reported by the same research groups. Publication biases were not identified in most meta-analyses with significant association results. Also, the proportion of studies published in language other than English is small therefore it should not have significant influence on the main results. Second, the percentage of meta-analyses with high heterogeneity (I  > 50) was more than 40% for all meta-analyses with a significant result. Although subgroup analyses stratified by ethnicity, histology, and smoking status were performed to address the heterogeneity, other sources of heterogeneity could exist and are difficult to address because of limited available data. Third, although we tried to explore the consistency and difference in genetic associations between some variants and lung cancer risk across different ethnic groups, meta-analyses stratified by ethnicity were performed only for Caucasian and Asian populations. Since very few enrolled original studies were carried out in other descent populations (e.g. African descent), the available data were not sufficient to perform subgroup meta-analyses in other descent populations. Additional association studies are needed to establish in populations of other ethnic descent for these reported variants. Finally, although we conducted systematic evaluations of cumulative epidemiological evidence for variants associated with lung cancer risk, biases cannot be completely excluded in this study. In summary, our comprehensive research synopsis and meta-analysis identified 22 variants in 21 genes had strong cumulative epidemiological evidence of significant associations with lung cancer risk. While, among variants without significant associations with lung cancer, seven had strong evidence. Our findings provided useful data and important references for the future studies to evaluate the genetic role in the field of lung cancer. The identification of genetic variants with robust association to lung cancer may help us to get more precise estimate of population risk stratification and potential target population for primary prevention.

Methods

Selection criteria and search strategies

All methods were in accordance with the PRISMA statement, the HuGE Review Handbook (version1.0) guiding genetic reviews specifically, and Meta-analysis Of Observational Studies in Epidemiology (MOOSE) guidelines[20–22, 45]. A study for inclusion had to meet the following four criteria: (1) it evaluated the association between a genetic polymorphism and lung cancer risk using a case-control, cohort, or a cross-sectional design in human; (2) lung cancer cases were diagnosed by pathological and/or histological examination; (3) it was published in a peer-reviewed scientific journal or online in English; (4) it provided sufficient information of genotype and/or allelic distributions for both cases and controls. We excluded studies with a family-based design and loci with genome-wide significant (P < 5 × 10−8) identified by GWAS since they have been replicated by many studies. To identify all published association studies potentially eligible for inclusion in our meta-analysis, we performed a comprehensive literature search (Fig. 1). Two electronic databases (PubMed and EMBASE) were queried with the terms “lung cancer (as well as synonyms of lung cancer) AND associate*” on or before December 31, 2014. This search yielded 41,457 publications, and then screened respectively for eligibility using the title, abstract, or full-paper, as necessary. For publications between December 31, 2014 and November 1, 2015, we searched databases (PubMed and EMBASE) monthly using the previous search terms and the additional terms of “lung cancer AND [gene/loci names identified in enrolled publications]”. This second search identified 4,453 additional potential publications. Furthermore, we screened for bibliographies in reviews, published meta-analyses, and cited articles from the retrieved publications. Taken together, a total of 1,018 eligible papers were finally selected and their full-text versions were carefully reviewed for further analyses (Fig. 1).

Data management and abstraction

When multiple publications used the same or overlapping data sets, we kept the data with the largest population or most recent ones as recommended by Little et al.[46]. Forty three publications with redundant information were then excluded. Using standard data extraction forms, we extracted the detailed publication information, study design, characteristics of participants, gene and variant information. Subgroup information (ancestry, smoking status, or histological types) were also separately extracted from each study whenever possible. Ancestry was divided into four general groups (African, Asian, Caucasian, and other/mixed) based on ancestry of at least 80% of the subjects[41]. If no details of ethnicity were reported, the determination was made based on the general population of the country or region where the study was done[41]. When a publication reported data from multi-racial groups, data for each population were extracted and analyzed separately if possible. To avoid the variant nomenclature confusion from different articles, we used the most current gene names and uniform identifiers (“rs” number) of variants in a public single nucleotide polymorphism (SNP) database (dbSNP, http://www.ncbi.nlm.nih.gov/projects/SNP/index.html), to designate the reported variants. For articles with “rs” number, we used as it was; for these without we used bioinformatics tools such as NCBI Blast (http://www.ncbi.nlm.nih.gov/BLAST/) and UCSC In-Silico PCR (http://genome.ucsc.edu/cgi-bin/hgPcr) to find “rs” number for the reported variant; for the remaining without any “rs” number, we used the common nomenclature (eg, MPG Arg59Cys according to amino acid substitution and GSTM1 present/null according to phenotype change) in the original articles.

Statistical analysis

All statistical analyses were performed using Stata software (version 12.0, StataCorp 2011, TX, USA), except where indicated otherwise. All tests were two-sided and considered statistically significant when p value was at 0.05 or lower, unless otherwise stated. All variants from at least three data sources were selected for meta-analysis[18]. Association between a variant and lung cancer risk was assessed by study-specific crude odds ratios (ORs) and 95% confidence intervals (CIs) using a DerSimonian and Laird random-effects model[47]. The initial main meta-analyses assessed the variant effect using an allelic genetic model (minor allele vs. major allele) without stratification. For the variation not in the form of single nucleotide substitution, a conventional comparison from the publications was used to assess the effects (eg, CYP2A6 [*4 vs. non*4], MMP3 rs3025058 [5A vs. 6A], and GSTM1 [null vs. present]). When average minor allele frequency (MAF) were greater than 50%, a rare occasion where major and minor alleles are flipped in different ethnic populations, we designated the minor allele from Caucasian population in all analyses. For the variant with sufficient genotype distribution data, we performed additional analyses based on dominant and recessive genetic models. Subgroup meta-analyses were also performed by ethnicity (Caucasian and Asian), histological types (SCLC, NSCLC, AD, and SCC), and smoking status (smoking and nonsmoking), if sufficient data were available. Between-study heterogeneity was assessed by calculating the Cochran Q statistic, with a p value less than 0.10 being the significant threshold[48]. We also used I heterogeneity metric to assess the heterogeneity[49]. Generally, I  < 25%, 25%-50% and > 50% showed mild, moderate, and strong heterogeneity, respectively. The publication bias of studies was evaluated by funnel plot analysis (logOR against standard error) and Begg’s test[50]. Potential small study effect (a trend for smaller study to show larger effect) was checked by the modified Egger’s test, which can lower the type I and type II error rates compared to the original Egger’s test[51]. We also conducted an excess significance test to examine whether there was a relative excess of formally significant findings in studies due to potential sources of bias, such as selective analyses, selective outcome reporting, or fabricated data[52]. For all variants that showed a significant association with lung cancer risk, we performed a sensitivity analysis to examine whether the significant summary ORs were robust after excluding the first published or first positive report, or excluding studies with controls violating Hardy-Weinberg equilibrium [HWE]. We used a Fisher’s exact/chi-square to assess the HWE among controls in each dataset.

Assessment of cumulative evidence

For each nominally significant results from the meta-analyses, Venice criteria was used to assess the credibility of cumulative epidemiological evidence[21]. Venice criteria is a semi-quantitative index which assigns three aspects for the amount of evidence, extent of replication, and protection from bias, and finally generates a composite assessment of “strong”, “moderate”, or “weak” epidemiological credibility for an association with lung cancer risk[21]. For the three aspects (the amount of evidence, extent of replication, and protection from bias) of Venice criteria, each aspect was assigned three levels (A, B, or C)[21]. Briefly, amount of evidence, depending on total sample size of the smallest genetic group among cases and controls in each meta-analysis, was graded as A (sample size >1000), B (sample size between 100 and 1000), or C (sample size <100). For very rare variant with frequency less than 0.5%, the amount of evidence was not assessed considering an A grade was unlikely to obtain[18]. The extent of replication, depending on between-study heterogeneity, was graded as A (I  < 25%), B (I between 25% and 50%), or C (I  > 50%). The protection from bias, considering various potential sources of bias in meta-analysis, was graded as A when there was no demonstrable bias and the bias would unlikely invalidate the association, B when there was insufficient information for identifying evidence (eg, missing information for evaluating HWE among controls in an individual study) although there was no obvious bias, and C when the bias was evident and/or was likely to explain the presence of association. More specifically, C grade was assigned if the meta-analysis had any of the following potential sources of bias: (1) the magnitude of the association was low (eg, OR <1.15 for risk effect, OR >0.87 for protective effect) with the exception of a highly consistent OR across studies enrolled in meta-analysis; (2) the sensitivity analysis indicated that the significant summary OR can be substantially changed; (3) the potential small study effect was present according to the modified Egger’s test (p-value < 0.10); (4) an excess of significant findings was possible (excess significance test, p-value < 0.10); (5) there was a potential publication bias (Begg’s test, p-value < 0.10). With the grades from three aspects, the credibility of cumulative epidemiological evidence was categorized as strong (all three aspect grades were A), moderate (any grade was B, but not C), or weak (any grade was C). Additionally, for the non-significant associations revealed by all meta-analyses, we also evaluated the credibility of cumulative epidemiological evidence based on three aspects: the degree of heterogeneity across studies, potential bias assessment, and statistical power. The statistical power was calculated by using SNP tools[53]. The credibility of cumulative epidemiological evidence of non-significant association was categorized as strong (if there was no or mild [I  < 25%] heterogeneity across studies, no demonstrable bias, and sufficient statistical power [power >90%]), weak (heterogeneity I  > 50%, or any potential bias detected, or low statistical power [power <80%]), or moderate (for other cases).

Data Availability

All data generated or analysed during this study are included in this article and its Supplementary Information file. Supplementary information file
  51 in total

1.  Quantifying heterogeneity in a meta-analysis.

Authors:  Julian P T Higgins; Simon G Thompson
Journal:  Stat Med       Date:  2002-06-15       Impact factor: 2.373

2.  A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints.

Authors:  Roger M Harbord; Matthias Egger; Jonathan A C Sterne
Journal:  Stat Med       Date:  2006-10-30       Impact factor: 2.373

3.  Family history of cancer and risk of lung cancer in lifetime non-smokers and long-term ex-smokers.

Authors:  R C Brownson; M C Alavanja; N Caporaso; E Berger; J C Chang
Journal:  Int J Epidemiol       Date:  1997-04       Impact factor: 7.196

4.  The manganese superoxide dismutase Ala16Val dimorphism modulates both mitochondrial import and mRNA stability.

Authors:  Angela Sutton; Audrey Imbert; Anissa Igoudjil; Véronique Descatoire; Sophie Cazanave; Dominique Pessayre; Françoise Degoul
Journal:  Pharmacogenet Genomics       Date:  2005-05       Impact factor: 2.089

5.  Fine mapping of chromosome 6q23-25 region in familial lung cancer families reveals RGS17 as a likely candidate gene.

Authors:  Ming You; Daolong Wang; Pengyuan Liu; Haris Vikis; Michael James; Yan Lu; Yian Wang; Min Wang; Qiong Chen; Dongmei Jia; Yan Liu; Weidong Wen; Ping Yang; Zhifu Sun; Susan M Pinney; Wei Zheng; Xiao-Ou Shu; Jirong Long; Yu-Tang Gao; Yong-Bing Xiang; Wong-Ho Chow; Nat Rothman; Gloria M Petersen; Mariza de Andrade; Yanhong Wu; Julie M Cunningham; Jonathan S Wiest; Pamela R Fain; Ann G Schwartz; Luc Girard; Adi Gazdar; Colette Gaba; Henry Rothschild; Diptasri Mandal; Teresa Coons; Juwon Lee; Elena Kupert; Daniela Seminara; John Minna; Joan E Bailey-Wilson; Christopher I Amos; Marshall W Anderson
Journal:  Clin Cancer Res       Date:  2009-04-07       Impact factor: 12.531

6.  Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012.

Authors:  Jacques Ferlay; Isabelle Soerjomataram; Rajesh Dikshit; Sultan Eser; Colin Mathers; Marise Rebelo; Donald Maxwell Parkin; David Forman; Freddie Bray
Journal:  Int J Cancer       Date:  2014-10-09       Impact factor: 7.396

Review 7.  TP53 mutations in nonsmall cell lung cancer.

Authors:  Akira Mogi; Hiroyuki Kuwano
Journal:  J Biomed Biotechnol       Date:  2011-01-18

8.  Genetic variants associated with colorectal cancer risk: comprehensive research synopsis, meta-analysis, and epidemiological evidence.

Authors:  Xiangyu Ma; Ben Zhang; Wei Zheng
Journal:  Gut       Date:  2013-08-14       Impact factor: 23.059

9.  Method for evaluating multiple mediators: mediating effects of smoking and COPD on the association between the CHRNA5-A3 variant and lung cancer risk.

Authors:  Jian Wang; Margaret R Spitz; Christopher I Amos; Xifeng Wu; David W Wetter; Paul M Cinciripini; Sanjay Shete
Journal:  PLoS One       Date:  2012-10-15       Impact factor: 3.240

10.  The association between the rs6495309 polymorphism in CHRNA3 gene and lung cancer risk in Chinese: a meta-analysis.

Authors:  Min Xiao; Lei Chen; Xiaoling Wu; Fuqiang Wen
Journal:  Sci Rep       Date:  2014-10-07       Impact factor: 4.379

View more
  32 in total

Review 1.  Small-cell lung cancer.

Authors:  Charles M Rudin; Elisabeth Brambilla; Corinne Faivre-Finn; Julien Sage
Journal:  Nat Rev Dis Primers       Date:  2021-01-14       Impact factor: 52.329

2.  Cross-ancestry genome-wide meta-analysis of 61,047 cases and 947,237 controls identifies new susceptibility loci contributing to lung cancer.

Authors:  Jinyoung Byun; Younghun Han; Yafang Li; Jun Xia; Erping Long; Jiyeon Choi; Xiangjun Xiao; Meng Zhu; Wen Zhou; Ryan Sun; Yohan Bossé; Zhuoyi Song; Ann Schwartz; Christine Lusk; Thorunn Rafnar; Kari Stefansson; Tongwu Zhang; Wei Zhao; Rowland W Pettit; Yanhong Liu; Xihao Li; Hufeng Zhou; Kyle M Walsh; Ivan Gorlov; Olga Gorlova; Dakai Zhu; Susan M Rosenberg; Susan Pinney; Joan E Bailey-Wilson; Diptasri Mandal; Mariza de Andrade; Colette Gaba; James C Willey; Ming You; Marshall Anderson; John K Wiencke; Demetrius Albanes; Stephan Lam; Adonina Tardon; Chu Chen; Gary Goodman; Stig Bojeson; Hermann Brenner; Maria Teresa Landi; Stephen J Chanock; Mattias Johansson; Thomas Muley; Angela Risch; H-Erich Wichmann; Heike Bickeböller; David C Christiani; Gad Rennert; Susanne Arnold; John K Field; Sanjay Shete; Loic Le Marchand; Olle Melander; Hans Brunnstrom; Geoffrey Liu; Angeline S Andrew; Lambertus A Kiemeney; Hongbing Shen; Shanbeh Zienolddiny; Kjell Grankvist; Mikael Johansson; Neil Caporaso; Angela Cox; Yun-Chul Hong; Jian-Min Yuan; Philip Lazarus; Matthew B Schabath; Melinda C Aldrich; Alpa Patel; Qing Lan; Nathaniel Rothman; Fiona Taylor; Linda Kachuri; John S Witte; Lori C Sakoda; Margaret Spitz; Paul Brennan; Xihong Lin; James McKay; Rayjean J Hung; Christopher I Amos
Journal:  Nat Genet       Date:  2022-08-01       Impact factor: 41.307

3.  Next-generation universal hereditary cancer screening: implementation of an automated hereditary cancer screening program for patients with advanced cancer undergoing tumor sequencing in a large HMO.

Authors:  Trevor L Hoffman; Hilary Kershberg; John Goff; Kimberly J Holmquist; Reina Haque; Monica Alvarado
Journal:  Fam Cancer       Date:  2022-10-20       Impact factor: 2.446

4.  TP53 Polymorphism Contributes to the Susceptibility to Bipolar Disorder but Not to Schizophrenia in the Chinese Han Population.

Authors:  Jialei Yang; Xulong Wu; Jiao Huang; Zhaoxia Chen; Guifeng Huang; Xiaojing Guo; Lulu Zhu; Li Su
Journal:  J Mol Neurosci       Date:  2019-05-05       Impact factor: 3.444

5.  Integrating germline and somatic genetics to identify genes associated with lung cancer.

Authors:  Jack Pattee; Xiaowei Zhan; Guanghua Xiao; Wei Pan
Journal:  Genet Epidemiol       Date:  2019-12-10       Impact factor: 2.135

6.  Individual and joint contributions of genetic and methylation risk scores for enhancing lung cancer risk stratification: data from a population-based cohort in Germany.

Authors:  Haixin Yu; Janhavi R Raut; Ben Schöttker; Bernd Holleczek; Yan Zhang; Hermann Brenner
Journal:  Clin Epigenetics       Date:  2020-06-18       Impact factor: 6.551

Review 7.  Molecular mechanisms of pulmonary carcinogenesis by polycyclic aromatic hydrocarbons (PAHs): Implications for human lung cancer.

Authors:  Rachel Stading; Grady Gastelum; Chun Chu; Weiwu Jiang; Bhagavatula Moorthy
Journal:  Semin Cancer Biol       Date:  2021-07-07       Impact factor: 15.707

8.  Association Between TERT rs2736098 Polymorphisms and Cancer Risk-A Meta-Analysis.

Authors:  Mi Zhou; Bo Jiang; Mao Xiong; Xin Zhu
Journal:  Front Physiol       Date:  2018-04-11       Impact factor: 4.566

9.  Ensuring the Safety and Security of Frozen Lung Cancer Tissue Collections through the Encapsulation of Dried DNA.

Authors:  Kevin Washetine; Mehdi Kara-Borni; Simon Heeke; Christelle Bonnetaud; Jean-Marc Félix; Lydia Ribeyre; Coraline Bence; Marius Ilié; Olivier Bordone; Marine Pedro; Priscilla Maitre; Virginie Tanga; Emmanuelle Gormally; Pascal Mossuz; Philippe Lorimier; Charles Hugo Marquette; Jérôme Mouroux; Charlotte Cohen; Sandra Lassalle; Elodie Long-Mira; Bruno Clément; Georges Dagher; Véronique Hofman; Paul Hofman
Journal:  Cancers (Basel)       Date:  2018-06-11       Impact factor: 6.639

10.  Serum magnesium levels and lung cancer risk: a meta-analysis.

Authors:  Xinghui Song; Xiaoning Zhong; Kaijiang Tang; Gang Wu; Yin Jiang
Journal:  World J Surg Oncol       Date:  2018-07-12       Impact factor: 2.754

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.