Yafang Li, Xiangjun Xiao, Yohan Bossé, Olga Gorlova, Ivan Gorlov, Younghun Han, Jinyoung Byun, Natasha Leighl, Jakob S Johansen, Matt Barnett, Chu Chen, Gary Goodman, Angela Cox, Fiona Taylor, Penella Woll, H Erich Wichmann, Judith Manz, Thomas Muley, Angela Risch, Albert Rosenberger, Jiali Han, Katherine Siminovitch, Susanne M Arnold, Eric B Haura, Ciprian Bolca, Ivana Holcatova, Vladimir Janout, Milica Kontic, Jolanta Lissowska, Anush Mukeria, Simona Ognjanovic, Tadeusz M Orlowski, Ghislaine Scelo, Beata Swiatkowska, David Zaridze, Per Bakke, Vidar Skaug, Shanbeh Zienolddiny, Eric J Duell, Lesley M Butler, Richard Houlston, María Soler Artigas, Kjell Grankvist, Mikael Johansson, Frances A Shepherd, Michael W Marcus, Hans Brunnström, Jonas Manjer, Olle Melander, David C Muller, Kim Overvad, Antonia Trichopoulou, Rosario Tumino, Geoffrey Liu, Stig E Bojesen, Xifeng Wu, Loic Le Marchand, Demetrios Albanes, Heike Bickeböller, Melinda C Aldrich, William S Bush, Adonina Tardon, Gad Rennert, M Dawn Teare, John K Field, Lambertus A Kiemeney, Philip Lazarus, Aage Haugen, Stephen Lam, Matthew B Schabath, Angeline S Andrew, Pier Alberto Bertazzi, Angela C Pesatori, David C Christiani, Neil Caporaso, Mattias Johansson, James D McKay, Paul Brennan, Rayjean J Hung, Christopher I Amos.
Abstract
The development of cancer is driven by the accumulation of many oncogenesis-related genetic alterations, and tumorigenesis is triggered by complex networks of interacting genes rather than by independent actions. To explore epistasis among oncogenesis-related genes in lung cancer development, we conducted pairwise genetic interaction analyses among 35,031 SNPs from 2027 oncogenesis-related genes. Genotypes from three independent genome-wide association studies, comprising a total of 24,037 lung cancer patients and 20,401 healthy controls of Caucasian ancestry, were analyzed. Using a two-stage design with discovery and replication studies, and a stringent Bonferroni correction for multiple testing, we identified significant genetic interactions between SNPs in RGL1:RAD51B (OR = 0.44, p = 3.27 × 10⁻¹¹ in overall lung cancer; OR = 0.41, p = 9.71 × 10⁻¹¹ in non-small cell lung cancer), SYNE1:RNF43 (OR = 0.73, p = 1.01 × 10⁻¹² in adenocarcinoma) and FHIT:TSPAN8 (OR = 1.82, p = 7.62 × 10⁻¹¹ in squamous cell carcinoma). None of these genes has been identified in previous main-effect association studies of lung cancer. Further eQTL gene expression analysis in lung tissues supported a functional role for the identified epistasis in lung tumorigenesis. Gene set enrichment analysis revealed potential pathways and gene networks underlying the molecular mechanisms of overall lung cancer and of its histological subtypes. Our results provide evidence that genetic interactions between oncogenesis-related genes play an important role in lung tumorigenesis, and that epistasis analysis, combined with functional annotation, is a valuable tool for uncovering novel functional susceptibility genes that contribute to lung cancer development by interacting with other modifier genes.
Keywords: epistasis; functional annotation; lung cancer; oncogenesis
Year: 2019 PMID: 30956756 PMCID: PMC6442994 DOI: 10.18632/oncotarget.26678
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Table 1. Summary and characteristics of three independent GWAS datasets used in the study
| Characteristic | OncoArray (Discovery), cases | OncoArray (Discovery), controls | Affymetrix (Replication 1), cases | Affymetrix (Replication 1), controls | GELCC (Replication 2), cases | GELCC (Replication 2), controls |
|---|---|---|---|---|---|---|
| Age (a) | 63.8 | 61.6 | 62.9 | 60.4 | 61.6 | 64.9 |
| Male (%) | 62.5 | 60.3 | 53.9 | 53.4 | 38.9 | 35.6 |
| Smoking status | | | | | | |
| Never (%) | 9.7 | 32.1 | 9.7 | 29.8 | 13.0 | 40.1 |
| Former (%) | 38.7 | 39.7 | 36.2 | 35.2 | 87.0 (c) | 59.9 (c) |
| Current (%) | 51.6 | 28.2 | 54.1 | 34.9 | NA | NA |
| Packyr (b) | 40.7 | 29.8 | 28.9 | 27.6 | NA | NA |
| Histology | | | | | | |
| NSCLC (%) | 73.9 | | 63.3 | | 70.8 | |
| ADE (%) | 38.9 | | 36.4 | | 40.7 | |
| SQC (%) | 25.1 | | 19.9 | | 14.3 | |
a, b: averages are given for age and pack-years (Packyr). c: includes both current and former smokers.
Figure 1. Carcinogenesis-related gene selection and statistical analysis strategy used in the study
A. Categorization of the molecular functions of the selected 2027 cancer-related genes (DAVID). B. Flow chart of statistical analysis strategy in the study.
Table 2. Signals from interaction analysis in genotyped discovery data, replication data sets and joint analysis
| SNP1 | A1 (MAF) | GENE1 | SNP2 | A1 (MAF) | GENE2 | OncoArray genotyped OR | OncoArray genotyped P | Affymetrix imputed OR | Affymetrix imputed P | GELCC imputed OR | GELCC imputed P | Joint OR | Joint P | Joint Q |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All lung cancer | ||||||||||||||
| rs74826777 | A (0.01) | RGL1 | rs4902626 | A (0.21) | RAD51B | 0.49 | 2.83E-07 | 0.30 | 9.63E-03 | 0.22 | 2.68E-02 | 0.46 | 3.24E-09 | 0.32 |
| rs74826777 | A (0.01) | RGL1 | rs2877496 | A (0.21) | RAD51B | 0.47 | 5.20E-08 | 0.18 | 6.71E-04 | 0.24 | 4.35E-02 | 0.43 | 1.59E-10 | 0.14 |
| rs74826777 | A (0.01) | RGL1 | rs1474960 | G (0.22) | RAD51B | 0.48 | 1.39E-07 | 0.16 | 3.56E-04 | 0.20 | 2.00E-02 | 0.43 | 3.11E-10 | 0.06 |
| rs74826777 | A (0.01) | RGL1 | rs17835218 | A (0.22) | RAD51B | 0.46 | 1.14E-08 | 0.17 | 4.66E-04 | 0.34 | 8.62E-02 | 0.42 | 6.60E-11 | 0.15 |
| rs74826777 | A (0.01) | RGL1 | rs17835244 | C (0.22) | RAD51B | 0.48 | 2.89E-08 | 0.17 | 4.84E-04 | 0.29 | 5.66E-02 | 0.44 | 3.27E-11 | 0.11 |
| rs3764240 | A (0.03) | CD109 | rs851984 | A (0.39) | ESR1 | 1.36 | 1.63E-06 | 1.41 | 9.61E-03 | 2.08 | 1.85E-02 | 1.39 | 1.01E-09 | 0.41 |
| rs3764240 | A (0.03) | CD109 | rs851983 | G (0.39) | ESR1 | 1.37 | 9.29E-07 | 1.41 | 9.84E-03 | 2.08 | 1.81E-02 | 1.39 | 1.01E-09 | 0.41 |
| rs3764240 | A (0.03) | CD109 | rs851982 | G (0.39) | ESR1 | 1.37 | 9.86E-07 | 1.43 | 7.36E-03 | 2.08 | 1.81E-02 | 1.39 | 8.30E-10 | 0.41 |
| rs7783961 | A (0.28) | CALCR | rs2505532 | A (0.41) | RET | 1.14 | 4.08E-07 | 1.10 | 9.38E-02 | 1.26 | 6.03E-02 | 1.14 | 8.30E-07 | 0.62 |
| NSCLC | ||||||||||||||
| rs74826777 | A (0.01) | RGL1 | rs2877496 | A (0.21) | RAD51B | 0.45 | 1.56E-07 | 0.16 | 1.68E-03 | 0.24 | 6.20E-02 | 0.41 | 3.81E-10 | 0.18 |
| rs74826777 | A (0.01) | RGL1 | rs1474960 | G (0.22) | RAD51B | 0.44 | 7.23E-08 | 0.13 | 6.84E-04 | 0.22 | 3.92E-02 | 0.40 | 9.71E-11 | 0.11 |
| rs9677398 | A (0.28) | THADA | rs2648875 | A (0.24) | PVT1 | 1.17 | 4.41E-07 | 1.11 | 8.62E-02 | 1.34 | 7.20E-02 | 1.16 | 9.50E-09 | 0.46 |
| rs7570751 | G (0.28) | THADA | rs2648875 | A (0.24) | PVT1 | 1.17 | 3.07E-07 | 1.11 | 8.44E-02 | 1.47 | 1.87E-02 | 1.17 | 5.46E-09 | 0.24 |
| rs6544655 | G (0.28) | THADA | rs2648875 | A (0.24) | PVT1 | 1.17 | 6.05E-07 | 1.11 | 7.29E-02 | 1.39 | 4.36E-02 | 1.17 | 4.79E-09 | 0.41 |
| rs6544657 | G (0.28) | THADA | rs2648875 | A (0.24) | PVT1 | 1.17 | 2.93E-07 | 1.11 | 8.42E-02 | 1.38 | 4.56E-02 | 1.17 | 7.91E-09 | 0.38 |
| rs1554783 | G (0.25) | SYNE1 | rs10515157 | A (0.16) | RNF43 | 0.84 | 1.78E-06 | 0.81 | 3.55E-03 | 0.69 | 3.70E-02 | 0.82 | 1.29E-08 | 0.57 |
| ADE | ||||||||||||||
| rs2131556 | A (0.21) | PTPRU | rs4646 | A (0.27) | CYP19A1 | 0.82 | 8.54E-07 | 0.87 | 8.79E-02 | 0.64 | 2.48E-02 | 0.82 | 2.70E-08 | 0.37 |
| rs1554783 | G (0.25) | SYNE1 | rs10515157 | A (0.16) | RNF43 | 0.79 | 3.04E-07 | 0.79 | 8.69E-03 | 0.52 | 5.47E-03 | 0.78 | 3.18E-09 | 0.22 |
| rs2758791 | G (0.26) | SYNE1 | rs10515157 | A (0.16) | RNF43 | 0.79 | 2.31E-07 | 0.79 | 8.68E-03 | 0.58 | 1.48E-02 | 0.78 | 4.28E-09 | 0.38 |
| SQC | ||||||||||||||
| rs6716971 | G (0.06) | BRE | rs6787614 | A (0.12) | RUVBL1 | 1.63 | 1.12E-06 | 2.12 | 2.07E-04 | 5.30 | 8.92E-02 | 1.74 | 6.03E-10 | 0.26 |
| rs1882898 | A (0.38) | FHIT | rs1705235 | C (0.05) | TSPAN8 | 1.51 | 1.45E-07 | 2.26 | 2.91E-02 | 9.95 | 9.31E-04 | 1.57 | 5.95E-09 | 0.01 |
| rs11135724 | G (0.27) | LOXL2 | rs208311 | G (0.30) | P2RX7 | 0.81 | 6.63E-07 | 0.82 | 3.76E-02 | 0.58 | 6.03E-02 | 0.81 | 6.31E-09 | 0.52 |
In the replication study, 37, 50, 33 and 67 SNP pairs had a meta-analysis p value < 0.05; only SNP pairs with an interaction p value < 0.1 in both replication datasets are reported in Table 2. ORs and p values from each individual dataset and from the joint meta-analysis are reported, and numbers highlighted in red indicate overall meta-analysis p values < 1.95 × 10⁻¹⁰ (the Bonferroni-corrected significance cutoff).
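The joint columns of Table 2 can be approximated from the per-dataset ORs and p values. The sketch below assumes a fixed-effect inverse-variance meta-analysis and treats Q as a Cochran heterogeneity p value; neither is stated in this excerpt, so both are assumptions. Standard errors are back-calculated from the rounded ORs and two-sided p values, so results will only be close to the reported ones. The function names are illustrative.

```python
import math
from statistics import NormalDist


def beta_and_se(odds_ratio, p_value):
    """Recover the log-OR and its standard error from an OR and a two-sided p value."""
    beta = math.log(odds_ratio)
    z = -NormalDist().inv_cdf(p_value / 2.0)  # |z| implied by the two-sided p value
    return beta, abs(beta) / z  # undefined for OR == 1 (beta == 0)


def fixed_effect_meta(results):
    """Inverse-variance fixed-effect meta-analysis of three (OR, p) pairs."""
    assert len(results) == 3, "the closed-form Q p value below assumes 2 degrees of freedom"
    betas, weights = [], []
    for odds_ratio, p_value in results:
        beta, se = beta_and_se(odds_ratio, p_value)
        betas.append(beta)
        weights.append(1.0 / se ** 2)
    total_weight = sum(weights)
    beta_joint = sum(w * b for w, b in zip(weights, betas)) / total_weight
    z_joint = beta_joint * math.sqrt(total_weight)
    p_joint = math.erfc(abs(z_joint) / math.sqrt(2.0))  # two-sided normal p value
    q_stat = sum(w * (b - beta_joint) ** 2 for w, b in zip(weights, betas))
    q_p = math.exp(-q_stat / 2.0)  # chi-square survival function with 2 df
    return math.exp(beta_joint), p_joint, q_p


# RGL1:RAD51B pair rs74826777 / rs17835244 (OncoArray, Affymetrix, GELCC rows of Table 2)
or_joint, p_joint, q_p = fixed_effect_meta(
    [(0.48, 2.89e-08), (0.17, 4.84e-04), (0.29, 5.66e-02)]
)
print(f"joint OR = {or_joint:.2f}, joint p = {p_joint:.2e}, Q p = {q_p:.2f}")
```

Under these assumptions the joint OR and the heterogeneity p value for this pair come out near the tabulated 0.44 and 0.11; the joint p value differs somewhat because the inputs are rounded.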
Figure 2. Imputed genetic interaction analysis in candidate regions
Plots 1-4 display the results for the all lung cancer, NSCLC, ADE and SQC cohorts, respectively. A. Interaction map with SNPs from ∼30 kb flanking regions, using imputed genotypes in the discovery dataset. The X and Y axes denote the SNPs in each gene. The color shade indicates the -log10(p) of the interaction p value. The signals are highlighted in grey boxes. B. Signals from the analysis using imputed genotypes. The X and Y axes denote the location (bp) of each SNP within its gene, and the Z axis displays the -log10(p) from the interaction analysis; the dashed plane indicates a p value of 1.95 × 10⁻¹⁰. Black, blue and green denote the results from the OncoArray, Affymetrix and GELCC imputed genotype data, respectively, and red denotes the results from the joint analysis combining all datasets.
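As a quick check on the significance plane at 1.95 × 10⁻¹⁰, the arithmetic below relates the cutoff to the number of tests it implies. The family-wise alpha of 0.05 is an assumption (it is not stated in this excerpt); the SNP count comes from the abstract.

```python
from math import comb

N_SNPS = 35_031    # SNPs analysed, from the abstract
CUTOFF = 1.95e-10  # reported Bonferroni-corrected cutoff
ALPHA = 0.05       # assumed family-wise error rate (not stated in this excerpt)

all_pairs = comb(N_SNPS, 2)           # every unordered SNP pair
implied_tests = ALPHA / CUTOFF        # number of tests the cutoff corresponds to
cutoff_all_pairs = ALPHA / all_pairs  # cutoff if every pair had been tested

print(f"all pairs        : {all_pairs:,}")
print(f"implied tests    : {implied_tests:,.0f}")
print(f"all-pairs cutoff : {cutoff_all_pairs:.2e}")
```

Under the assumed alpha, the cutoff corresponds to fewer tests than the full set of SNP pairs, which would be consistent with some pairs having been excluded from testing; the excerpt does not say which.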
Figure 3. Interaction analysis at candidate gene pairs
A. Stratified lung cancer risk analysis at the identified significant SNP pairs, using genotype data from the discovery OncoArray dataset. MAF1 and MAF2 indicate the minor allele frequency of each SNP in the pair. 0 and 1 indicate genotypes without or with at least one copy of the minor allele at the second SNP. P values and ORs of the SNP1 main-effect association, conditioned on the genotype at SNP2, are displayed. B. Bar plots of the risk effect at joint genotypes. In each plot, the genotype with no minor allele at either locus (0/0) was used as the reference group. C. Analysis of the genetic interaction effect on gene expression in human lung tissues. The interaction effects of SNP pairs on gene expression levels were evaluated using linear models. The evaluated genes and selected probes are labeled on the Y axes. Boxplots of gene expression level are shown for each joint genotype group to display the genetic interaction effect on gene expression. Genotypes were coded in a dominant mode, i.e. 0 or 1 for absence or presence of the minor allele, respectively.
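The stratified and joint-genotype comparisons described for panels A and B can be illustrated with dominant coding (0 or 1 for absence or presence of the minor allele). The sketch below uses hypothetical case/control counts, not study data, and an unadjusted saturated model, in which the interaction OR equals the ratio of the two stratified ORs. The study's actual model and covariates are not given in this excerpt, and the function names are illustrative.

```python
def dominant(minor_allele_count):
    """Dominant coding: 1 if at least one copy of the minor allele, else 0."""
    return 1 if minor_allele_count > 0 else 0


def interaction_summary(counts):
    """counts maps (snp1, snp2) dominant codes to (cases, controls)."""
    odds = {g: cases / controls for g, (cases, controls) in counts.items()}
    # OR of SNP1 within each stratum of SNP2 (panel A style)
    or_snp1_given_0 = odds[(1, 0)] / odds[(0, 0)]
    or_snp1_given_1 = odds[(1, 1)] / odds[(0, 1)]
    # OR of each joint genotype against the 0/0 reference group (panel B style)
    joint_or = {g: o / odds[(0, 0)] for g, o in odds.items()}
    # Multiplicative interaction: ratio of the two stratified ORs
    interaction_or = or_snp1_given_1 / or_snp1_given_0
    return or_snp1_given_0, or_snp1_given_1, interaction_or, joint_or


# Hypothetical counts for illustration only: (cases, controls) per joint genotype
example_counts = {
    (0, 0): (1000, 1000),
    (1, 0): (30, 20),
    (0, 1): (400, 400),
    (1, 1): (6, 8),
}
or_0, or_1, or_int, joint = interaction_summary(example_counts)
print(or_0, or_1, or_int)
```

With these made-up counts, SNP1 raises risk when the minor allele of SNP2 is absent and lowers it when present, which gives an interaction OR below 1.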
Table 3. Top 5 canonical pathways involving the genes from identified genetic interactions in each lung cancer subtype
| Subset | Canonical pathway (genes) | P value | Overlap |
|---|---|---|---|
| ALL | Glioblastoma Multiforme Signaling (TSC1, EGF, PLCB1, FGFR2, GSK3B, EGFR) | 3.35 × 10⁻⁷ | 6/162 |
| | HER-2 Signaling (TSC1, EGF, FGFR2, GSK3B, EGFR) | 4.18 × 10⁻⁷ | 5/88 |
| | Gαq Signaling (CALCR, NFATC2, PLCB1, FGFR2, GSK3B) | 8.24 × 10⁻⁶ | 5/161 |
| | Regulation of the Epithelial-Mesenchymal Transition Pathway (NOTCH4, EGF, FGFR2, GSK3B, EGFR) | 1.79 × 10⁻⁵ | 5/189 |
| | ErbB Signaling (EGF, FGFR2, GSK3B, EGFR) | 2.46 × 10⁻⁵ | 4/98 |
| NSCLC | Regulation of the Epithelial-Mesenchymal Transition Pathway (BCL9, EGFR, FGFR2) | 6.02 × 10⁻³ | 3/189 |
| | UVB-Induced MAPK Signaling (EGFR, FGFR2) | 7.43 × 10⁻³ | 2/66 |
| | EGF Signaling (EGFR, FGFR2) | 7.87 × 10⁻³ | 2/68 |
| | Caveolar-mediated Endocytosis Signaling (EGFR, ITGA11) | 8.55 × 10⁻³ | 2/71 |
| | ErbB4 Signaling (FGFR2, YAP1) | 8.79 × 10⁻³ | 2/71 |
| ADE | Role of CHK Proteins in Cell Cycle Checkpoint Control (BRCA1, E2F2, HUS1) | 1.12 × 10⁻⁴ | 3/57 |
| | DNA damage-induced 14-3-3σ Signaling (BRCA1, HUS1) | 4.40 × 10⁻⁴ | 2/19 |
| | Glioma Signaling (E2F2, EGFR, IGF2R) | 8.60 × 10⁻⁴ | 3/114 |
| | Role of Oct4 in Mammalian Embryonic Stem Cell Pluripotency (BRCA1, POU5F1) | 2.59 × 10⁻³ | 2/46 |
| | Spliceosomal Cycle (U2AF1/U2AF1L5) | 3.28 × 10⁻³ | 1/2 |
| SQC | Osteoarthritis Pathway (CASP8, FN1, GLI3, P2RX7, PPARD, SDC4, TCF7L2) | 8.84 × 10⁻⁶ | 7/212 |
| | Protein Kinase A Signaling (AKAP12, CDC25C, DUSP10, GLI3, PDE4D, PLCB1, PTPRE, TCF7L2, TGFB2) | 1.00 × 10⁻⁵ | 9/401 |
| | Aryl Hydrocarbon Receptor Signaling (ATR, ESR1, ESR2, TGFB2) | 1.46 × 10⁻³ | 4/141 |
| | Inflammasome Pathway (CASP8, P2RX7) | 2.16 × 10⁻³ | 2/20 |
| | Molecular Mechanisms of Cancer (ATR, CASP8, CDC25C, GAB2, PLCB1, TGFB2) | 2.45 × 10⁻³ | 6/394 |
The Fisher exact test p value evaluates whether the input genes are biologically connected rather than randomly associated. Overlap gives the number of input genes found in a canonical pathway over the total number of genes in that pathway.
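The enrichment p value for a pathway overlap can be sketched as a right-tailed Fisher exact test, which is the upper tail of a hypergeometric distribution. The sidedness of the test, the number of input genes and the size of the background gene set are not given in this excerpt, so `n_input` and `n_background` below are hypothetical placeholders, and the printed value is not expected to match the table.

```python
from math import comb


def enrichment_p(overlap, pathway_size, n_input, n_background):
    """Right-tailed Fisher exact p value: P(X >= overlap) when n_input genes
    are drawn from n_background genes, pathway_size of which are in the pathway."""
    total = comb(n_background, n_input)
    upper = min(pathway_size, n_input)
    tail = sum(
        comb(pathway_size, k) * comb(n_background - pathway_size, n_input - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Overlap 6/162 is taken from the first ALL row; the other two numbers are assumed.
p_example = enrichment_p(overlap=6, pathway_size=162, n_input=60, n_background=20_000)
print(f"{p_example:.2e}")
```

Exact integer arithmetic is used for the binomial coefficients, so the tail probability does not suffer from floating-point underflow before the final division.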
Figure 4. Gene network analysis using IPA program
A-C display the top networks from the NSCLC, ADE and SQC cohorts, respectively. Genes labeled in pink are the input candidate genes from the interaction analysis. Arrow lines indicate the potential genetic interactions identified in the G x G analysis. Red lines indicate genetic interactions that either achieved the significance level in the joint analysis or had multiple SNP pairs with consistent evidence for genetic interaction across the discovery and replication datasets without achieving significance in the joint analysis; blue lines indicate sporadic signals for potential genetic interactions that did not achieve significance.