Outi Uimari, Nilufer Rahmioglu, Dale R Nyholt, Katy Vincent, Stacey A Missmer, Christian Becker, Andrew P Morris, Grant W Montgomery, Krina T Zondervan.
Abstract
Study question: Do genome-wide association study (GWAS) data for endometriosis provide insight into novel biological pathways associated with its pathogenesis?

Summary answer: GWAS analysis uncovered multiple pathways that are statistically enriched for genetic association signals. Analysis of Stage A disease highlighted a novel variant in MAP3K4, while the top pathways significantly associated with all endometriosis and Stage A disease included several mitogen-activated protein kinase (MAPK)-related pathways.

What is known already: Endometriosis is a complex disease with an estimated heritability of 50%. To date, GWAS have revealed 10 genomic regions associated with endometriosis, explaining <4% of heritability, while half of the heritability is estimated to be due to common risk variants. Pathway analyses combine the evidence of single variants into gene-based measures, leveraging the aggregate effect of variants in genes and uncovering biological pathways involved in disease pathogenesis.

Study design, size, duration: Pathway analysis was conducted utilizing the International Endogene Consortium GWAS data, comprising 3194 surgically confirmed endometriosis cases and 7060 controls of European ancestry, with genotype data imputed up to the 1000 Genomes Phase 3 reference panel. GWAS was performed for all endometriosis cases and for Stage A (revised American Fertility Society (rAFS) I/II, n = 1686) and Stage B (rAFS III/IV, n = 1364) cases separately. The identified significant pathways were compared with pathways previously investigated in the literature through candidate association studies.

Participants/materials, setting, methods: The most comprehensive biological pathway databases, MSigDB (including BioCarta, KEGG, PID, SA, SIG, ST and GO) and PANTHER, were utilized to test for enrichment of genetic variants associated with endometriosis. Statistical enrichment analysis was performed using the MAGENTA (Meta-Analysis Gene-set Enrichment of variaNT Associations) software.
Main results and the role of chance: The first genome-wide association analysis for Stage A endometriosis revealed a novel locus, rs144240142 (P = 6.45 × 10−8, OR = 1.71, 95% CI = 1.23–2.37), an intronic single-nucleotide polymorphism (SNP) within MAP3K4. This SNP was not associated with Stage B disease (P = 0.086). MAP3K4 was also shown to be differentially expressed in eutopic endometrium between Stage A endometriosis cases and controls (P = 3.8 × 10−4), but not between Stage B cases and controls (P = 0.26). A total of 14 pathways enriched with genetic endometriosis associations were identified (false discovery rate (FDR)-P < 0.05). The pathways associated with any endometriosis were Grb2-Sos provides linkage to MAPK signaling for integrins pathway (P = 2.8 × 10−5, FDR-P = 3.0 × 10−3), Wnt signaling (P = 0.026, FDR-P = 0.026) and p130Cas linkage to MAPK signaling for integrins pathway (P = 6.0 × 10−4, FDR-P = 0.029). The pathways associated with Stage A endometriosis included extracellular signal-regulated kinase (ERK)1 ERK2 MAPK (P = 5.0 × 10−4, FDR-P = 5.0 × 10−4). Those associated with Stage B endometriosis were two overlapping pathways related to extracellular matrix biology: Core matrisome (P = 1.4 × 10−3, FDR-P = 0.013) and ECM glycoproteins (P = 1.8 × 10−3, FDR-P = 7.1 × 10−3). Genes arising from endometriosis candidate gene studies performed to date were enriched for Interleukin signaling pathway (P = 2.3 × 10−12), Apoptosis signaling pathway (P = 9.7 × 10−9) and Gonadotropin releasing hormone receptor pathway (P = 1.2 × 10−6); however, these pathways did not feature in the results based on GWAS data.

Large scale data: Not applicable.

Limitations, reasons for caution: The analysis is restricted to (i) variants in/near genes that can be assigned to pathways, excluding intergenic variants; (ii) the gene-based pathway definition as registered in the databases; (iii) women of European ancestry.
Wider implications of the findings: The top ranked pathways associated with overall and Stage A endometriosis in particular involve integrin-mediated MAPK activation and intracellular ERK/MAPK acting downstream in the MAPK cascade, both acting in the control of cell division, gene expression, cell movement and survival. Other top enriched pathways in Stage B disease include ECM glycoprotein pathways important for extracellular structure and biochemical support. The results highlight the need for increased efforts to understand the functional role of these pathways in endometriosis pathogenesis, including the investigation of the biological effects of the genetic variants on downstream molecular processes in tissue relevant to endometriosis. Additionally, our results offer further support for the hypothesis of at least partially distinct causal pathophysiology for minimal/mild (rAFS I/II) vs. moderate/severe (rAFS III/IV) endometriosis.

Study funding/competing interest(s): The genome-wide association data and Wellcome Trust Case Control Consortium (WTCCC) were generated through funding from the Wellcome Trust (WT084766/Z/08/Z, 076113 and 085475) and the National Health and Medical Research Council (NHMRC) of Australia (241944, 339462, 389927, 389875, 389891, 389892, 389938, 443036, 442915, 442981, 496610, 496739, 552485 and 552498). N.R. was funded by a grant from the Medical Research Council UK (MR/K011480/1). A.P.M. is a Wellcome Trust Senior Fellow in Basic Biomedical Science (grant WT098017). All authors declare there are no conflicts of interest.
Keywords: MAPK signaling; disease subtypes; endometriosis; genetics; genome-wide association; pathway analysis
Year: 2017 PMID: 28333195 PMCID: PMC5400041 DOI: 10.1093/humrep/dex024
Source DB: PubMed Journal: Hum Reprod ISSN: 0268-1161 Impact factor: 6.918
Independent signals from overall, Stage A and Stage B GWAS results with P < 1 × 10−6.
| Rsid (Chr:Position) | A1/A2 (MAF) | Overall P | Overall OR (95% CI) | Stage A P | Stage A OR (95% CI) | Stage B P | Stage B OR (95% CI) | Variant type | Nearest gene (distance) | Regulatory function from ENCODE (±25 Kb)* |
|---|---|---|---|---|---|---|---|---|---|---|
| rs6908034 (6:19773930) | G/A (0.16) | 5.36 × 10−7 | 1.22 (1.12–1.32) | 0.012 | 1.13 (1.02–1.25) | 7.31 × 10−7 | 1.30 (1.17–1.45) | Intronic SNP | ID4 (64 056bp) | Located in an anti-sense RNA, RP1-167F1.2 |
| rs12700667 (7:25901639) | G/A (0.25) | 5.57 × 10−7 | 1.17 (1.09–1.25) | 0.038 | 1.07 (0.98–1.16) | 2.45 × 10−9 | 1.32 (1.20–1.46) | Intergenic SNP | NFE2L3 (290 221bp) | (1) Near a microRNA, mir148a (87 900bp), (2) In histone modification marks H3K27AC, H3K4Me1, H3K4Me3, (3) In/near TFB sites for MXI1, POLR2A, TBP, NFYA, ARID3A, GATA3, ELF1, TEAD4, JUND, SMARCA4, SIX5, MAX, NRF1, RFX5, CHD2, CREB1, CEBPC, ATF1, KDM5B, JUN, NFYB, RUNX3, SP4, MAZ, SIN3A, ZBTB7A, MYC, STAT3, HMGN3, CCNT2, CBX3, TCF3, BHLHE40, EP300, E2F6, FOXP2, GABPA, ZNF143, SPI1, USF1, EGR1, E2F4, E2F1, MAFK, TCF7L2, POU2F2, TAF1, PHF8, IRF1, FOXA1 (±1 Kb) |
| rs55938609 (1:22470451) | G/C (0.16) | 6.11 × 10−7 | 1.24 (1.15–1.35) | 2.36 × 10−3 | 1.17 (1.06–1.29) | 8.33 × 10−7 | 1.33 (1.20–1.48) | SNP Upstream of gene | WNT4 (932bp) | (1) In histone modification mark H3K4Me1, (2) In/near TFB sites for EZH2, FOXA2, FOXA1, EZH2, RAD21, CTCF, PAX5, POLR2A, E2F1, EGR1, CCNT2, SIN3A, RBBP5 (±1 Kb) |
| rs138913144 (9:133897939) | A/ATATT (0.07) | 6.38 × 10−7 | 0.77 (0.69–0.87) | 4.47 × 10−3 | 0.86 (0.74–1.00) | 9.94 × 10−7 | 0.68 (0.57–0.81) | Intronic Insertion | LAMC3 (0bp) | (1) Near a small nucleolar RNA, SNORA31 (1246bp), (2) Near histone modification mark H3K4Me1 |
| rs116175374 (2:31425185) | G/A (0.04) | 6.45 × 10−7 | 0.69 (0.59–0.81) | 4.03 × 10−5 | 0.68 (0.55–0.85) | 1.79 × 10−3 | 0.73 (0.58–0.92) | Intronic SNP | CAPN14 (0bp) | (1) Near histone modification mark H3K4Me1 |
| rs60966186 (8:6831204) | A/G (0.19) | 9.85 × 10−7 | 0.85 (0.79–0.92) | 6.71 × 10−5 | 0.87 (0.79–0.96) | 3.62 × 10−5 | 0.84 (0.75–0.93) | Intergenic SNP | DEFA1 (4088bp) | (1) Near histone modification mark H3K4Me1, (2) Near TFB site for KAP1 (±1 Kb) |
| rs144240142 (6:161503024) | T/C (0.01) | 5.79 × 10−5 | 1.46 (1.10–1.93) | 6.45 × 10−8 | 1.71 (1.23–2.37) | 0.086 | 1.19 (0.79–1.78) | Intronic SNP | MAP3K4 (0bp) | (1) Transcribed on seven cell lines assayed by RNA-seq data, (2) Near TFB sites for EGR1, CEBPB, FOSL1, FOS (±1 Kb) |
| rs200922190 (1:193203491) | A/AAATTAT (0.22) | 2.35 × 10−4 | 0.90 (0.84–0.97) | 1.79 × 10−7 | 0.83 (0.75–0.91) | 0.097 | 0.97 (0.88–1.07) | Intronic Insertion | CDC73 (0bp) | (1) Transcribed on seven cell lines assayed by RNA-seq data, (2) Near TFB sites for TCF7L2, SETDB1, FOXA1, KAP1, TFAP2A, TFAP2C, FOS, ELF1, FAM48A (±1 Kb), (3) Near histone modification marks H3K4Me1, H3K27Ac |
| 8:2806920 | G/GAAAGAAAAGAAAAGAAAAG (0.17) | 0.73 | 1.36 (0.43–4.30) | 3.82 × 10−7 | 0.79 (0.71–0.88) | 0.14 | 0.94 (0.84–1.05) | Intronic deletion | CSMD1 (0bp) | (1) Near histone modification mark H3K4Me1, (2) Near TFB sites for MAX, FOXP2, REST. |
| rs855965 (10:119443759) | G/A (0.34) | 1.23 × 10−3 | 0.89 (0.84–0.95) | 4.10 × 10−7 | 0.81 (0.74–0.88) | 0.96 | 1.00 (0.92–1.09) | Intergenic SNP | EMX2 (134 702bp) | (1) Near histone modification mark H3K4Me1, (2) Near TFB sites for STAT3, CTCF, RAD21, SMC3, ESR1 |
| rs113850637 (3:103850400) | C/T (0.16) | 4.56 × 10−4 | 1.15 (1.06–1.25) | 8.08 × 10−7 | 1.28 (1.16–1.41) | 0.60 | 1.01 (0.90–1.13) | Intergenic SNP | ALCAM (1 235 853bp) | (1) Near a microRNA, mir548a3 (53 076bp), (2) Near histone modification mark H3K4Me1, (3) Near TFB sites for EP300, GATA2, JUN, FOS, MAX, USF1, YY1, CTCF, TCF12, POU5F1 |
| rs12700667 (7:25901639) | G/A (0.25) | 5.57 × 10−7 | 1.17 (1.09–1.25) | 0.038 | 1.07 (0.98–1.16) | 2.45 × 10−9 | 1.32 (1.20–1.46) | Intergenic SNP | NFE2L3 (290 221bp) | (1) Near a microRNA, mir148a (87 900bp), (2) In histone modification marks H3K27AC, H3K4Me1, H3K4Me3, (3) In/near TFB sites for MXI1, POLR2A, TBP, NFYA, ARID3A, GATA3, ELF1, TEAD4, JUND, SMARCA4, SIX5, MAX, NRF1, RFX5, CHD2, CREB1, CEBPC, ATF1, KDM5B, JUN, NFYB, RUNX3, SP4, MAZ, SIN3A, ZBTB7A, MYC, STAT3, HMGN3, CCNT2, CBX3, TCF3, BHLHE40, EP300, E2F6, FOXP2, GABPA, ZNF143, SPI1, USF1, EGR1, E2F4, E2F1, MAFK, TCF7L2, POU2F2, TAF1, PHF8, IRF1, FOXA1 (±1 Kb) |
| rs517875 (3:174350886) | C/A (0.42) | 5.56 × 10−4 | 1.09 (1.03–1.16) | 0.35 | 1.01 (0.93–1.09) | 1.06 × 10−7 | 1.23 (1.13–1.33) | Intronic SNP | NAALADL2 (0bp) | (1) Near TFB sites for MAFK, ESR1 |
| rs7041895 (9:22162794) | A/C (0.43) | 4.81 × 10−4 | 1.12 (1.05–1.18) | 0.60 | 1.02 (0.95–1.10) | 1.06 × 10−7 | 1.26 (1.16–1.37) | Intergenic SNP | CDKN2BAS1 (41 701bp) | Near histone modification mark H3K4Me1 |
| rs1250258 (2:216300185) | C/T (0.27) | 2.48 × 10−5 | 0.89 (0.83–0.95) | 0.22 | 0.98 (0.90–1.07) | 3.48 × 10−8 | 0.81 (0.74–0.88) | Intronic SNP | FN1 (0bp) | (1) In histone modification mark H3K4Me3, H3K27Ac. (2) In/near TFB sites for POLR2A, TEAD4, TAF1, MBD4, MXI1, RBBP5, SIN3A, FOXA2, MAX, EZH2, RCOR1, MYC |
| 12:95403979 (12:95403979) | C/CT (0.11) | 3.35 × 10−6 | 1.24 (1.13–1.36) | 0.13 | 1.11 (0.98–1.24) | 2.43 × 10−7 | 1.38 (1.23–1.56) | Intergenic SNP | NR2C1 (12 026bp) | (1) Near TFB sites for TRIM28, CBX3, USF1, CTCF |
| rs71415016 (14:96443958) | T/C (0.08) | 1.16 × 10−4 | 1.19 (1.06–1.32) | 0.16 | 1.03 (0.89–1.19) | 3.00 × 10−7 | 1.38 (1.20–1.59) | Intergenic SNP | C14orf132 (61 880bp) | Near histone modification mark H3K4Me1 |
| rs62469231 (7:114031174) | G/A (0.02) | 1.10 × 10−5 | 1.44 (1.18–1.77) | 0.018 | 1.23 (0.94–1.60) | 4.95 × 10−7 | 1.77 (1.37–2.27) | Intronic SNP | FOXP2 (0bp) | Near histone modification mark H3K4Me1 |
| rs3920498 (1:22492887) | G/C (0.20) | 1.10 × 10−5 | 1.19 (1.11–1.28) | 0.034 | 1.11 (1.02–1.22) | 6.49 × 10−7 | 1.30 (1.18–1.43) | Intergenic SNP | WNT4 (22 502bp) | (1) In histone modification mark H3K4Me1, (2) In/near TFB sites for RELA, POU2F2, EBF1 (±1 Kb) |
| rs6908034 (6:19773930) | G/A (0.16) | 5.36 × 10−7 | 1.22 (1.12–1.32) | 0.012 | 1.13 (1.02–1.25) | 7.31 × 10−7 | 1.30 (1.17–1.45) | Intronic SNP | ID4 (64 056bp) | Located in an anti-sense RNA, RP1-167F1.2 |
| rs12455952 (18:58840518) | T/G (0.19) | 1.75 × 10−5 | 1.16 (1.07–1.25) | 0.021 | 1.09 (0.99–1.20) | 9.59 × 10−7 | 1.27 (1.15–1.40) | Intergenic SNP | CDH20 (317 287bp) | Near TFB sites for MAFK, E2F4, FOS |
| rs138913144 (9:133897939) | A/ATATT (0.07) | 6.38 × 10−7 | 0.77 (0.69–0.87) | 4.47 × 10−3 | 0.86 (0.74–1.00) | 9.94 × 10−7 | 0.68 (0.57–0.81) | Intronic insertion | LAMC3 (0bp) | (1) Near a small nucleolar RNA, SNORA31 (1246bp), (2) Near histone modification mark H3K4Me1 |
TFB sites, transcription factor binding sites; rsid, SNP ID; Chr, chromosome; MAF, minor allele frequency; P, association test P-value; OR, odds ratio; SNP, single-nucleotide polymorphism; GWAS, genome-wide association study. *Histone modification marks as identified from seven cell lines (GM12878, H1-hESC, HSMM, HUVEC, K562, NHEK, NHLF) from the ENCODE database, and DNase I hypersensitivity peaks identified from 95 cell types from the ENCODE database, using UCSC genome browser annotation tools.
Figure 1. Manhattan plot of association of single-nucleotide polymorphisms (SNPs) with Stage A endometriosis in the GWAS. The red horizontal line marks genome-wide significance (P < 5 × 10−8), and the blue line marks nominal significance (P < 5 × 10−6).
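To make the two thresholds in Figure 1 concrete, the following minimal Python sketch places a few Stage A P-values, copied from the table of independent signals, relative to the red and blue lines. The function name `classify` and the label strings are illustrative assumptions, not part of the study's analysis pipeline.

```python
GENOME_WIDE = 5e-8  # red line in Figure 1
BLUE_LINE = 5e-6    # blue line in Figure 1

def classify(p: float) -> str:
    """Return which threshold band a P-value falls into."""
    if p < GENOME_WIDE:
        return "P < 5e-8"
    if p < BLUE_LINE:
        return "5e-8 <= P < 5e-6"
    return "P >= 5e-6"

# Stage A P-values taken from the table of independent signals
stage_a_p = {
    "rs144240142": 6.45e-8,
    "rs200922190": 1.79e-7,
    "rs855965": 4.10e-7,
    "rs12700667": 0.038,
}

labels = {rsid: classify(p) for rsid, p in stage_a_p.items()}
for rsid, label in labels.items():
    print(f"{rsid}: {label}")
```

Under this rule the lead Stage A variant, rs144240142 (P = 6.45 × 10−8), sits between the two lines, just short of the 5 × 10−8 threshold.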
Genome-wide pathway analysis of all, Stage A and Stage B endometriosis results identified in MAGENTA.
| Database (#pathways) | Pathway | Geneset size | P-value | FDR P-value | Bonferroni P-value* | Expected | Observed |
|---|---|---|---|---|---|---|---|
| REACTOME (671) | Grb2-Sos provides linkage to MAPK signaling for integrins | 15 | 2.8 × 10−5 | 3 × 10−3 | NA | 1 | 6 |
| SA (9) | Wnt signaling | 89 | 0.026 | 0.026 | 0.24 | 4 | 9 |
| REACTOME (671) | p130CAS linkage to MAPK signaling for integrins | 15 | 6 × 10−4 | 0.029 | NA | 1 | 5 |
| ST (23) | ERK1 ERK2 MAPK pathway | 32 | 5 × 10−4 | 5 × 10−4 | 0.011 | 2 | 7 |
| SA (9) | TRKA receptor | 17 | 1 × 10−3 | 2.5 × 10−3 | 9.0 × 10−3 | 1 | 5 |
| SIG (8) | PIP3 signaling in cardiac myocytes | 67 | 1.6 × 10−3 | 4.5 × 10−3 | 0.013 | 3 | 10 |
| ST (23) | G alpha S pathway | 16 | 5.5 × 10−3 | 8.8 × 10−3 | 0.13 | 1 | 4 |
| ST (23) | Phosphoinositide three kinase pathway | 37 | 9.1 × 10−3 | 9.1 × 10−3 | 0.21 | 2 | 6 |
| ST (23) | Differentiation pathway in PC12 cells | 45 | 5.5 × 10−3 | 0.019 | 0.13 | 2 | 7 |
| SA (9) | B cell receptor complexes | 24 | 0.026 | 0.026 | 0.24 | 1 | 4 |
| SIG (8) | Insulin receptor pathway in cardiac myocytes | 51 | 0.033 | 0.033 | 0.26 | 2 | 6 |
| SA (9) | PTEN pathway | 17 | 0.051 | 0.042 | 0.46 | 1 | 3 |
| NABA (10) | ECM glycoproteins | 196 | 1.8 × 10−3 | 7.1 × 10−3 | 0.018 | 9 | 19 |
| NABA (10) | CORE Matrisome | 275 | 1.4 × 10−3 | 0.013 | 0.014 | 13 | 24 |
*For databases with <25 pathways, we computed the Bonferroni P-value adjusted for the number of pathways within the given database resource, in addition to the FDR multiple-testing correction computed by the MAGENTA software (see Materials and Methods). Note that the Bonferroni adjustment is likely to be conservative given that the pathways within a resource are never fully independent of each other. Databases: SigmaAldrich (SA), Signaling Gateway (SIG), Signaling Transduction KE (ST), Matrisome Project gene sets (http://web.mit.edu/hyneslab/matrisome/) (Naba ) (NABA).
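The Bonferroni adjustment described in the footnote can be illustrated with a minimal Python sketch. It assumes the adjusted value is simply the nominal P-value multiplied by the number of pathways in the database, capped at 1. The function name `bonferroni` is illustrative and is not part of MAGENTA. The inputs are the pathway counts and nominal P-values for four rows of the table (for ERK1 ERK2 MAPK and Wnt signaling, these are the nominal P-values quoted in the abstract).

```python
def bonferroni(p_nominal: float, n_tests: int) -> float:
    """Bonferroni-adjusted P-value: nominal P times the number of tests, capped at 1."""
    return min(1.0, p_nominal * n_tests)

# (pathway, number of pathways in its database, nominal P-value) from the table
rows = [
    ("ERK1 ERK2 MAPK pathway", 23, 5e-4),
    ("TRKA receptor", 9, 1e-3),
    ("Wnt signaling", 9, 0.026),
    ("PTEN pathway", 9, 0.051),
]

adjusted = {name: bonferroni(p, n) for name, n, p in rows}
for name, value in adjusted.items():
    print(f"{name}: {value:.4f}")
```

The products (0.0115, 0.009, 0.234 and 0.459) agree with the tabulated Bonferroni values of 0.011, 9.0 × 10−3, 0.24 and 0.46 to within rounding; small differences such as 0.234 versus 0.24 are expected because the tabulated nominal P-values are themselves rounded.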
Figure 2. Mitogen-activated protein kinase (MAPK)-related pathways enriched for genome-wide association study (GWAS) associations with endometriosis.
Most frequently investigated biological pathways for endometriosis through candidate gene studies, as defined by PANTHER database (July 2014)
| PANTHER: Biological pathway | Candidate genesᵃ |
|---|---|
| Interleukin signaling pathway | 13 |
| Apoptosis signaling pathway | 11 |
| Gonadotropin releasing hormone receptor pathway | 12 |
| Inflammation mediated by chemokine/cytokine signaling | 12 |
| p53 pathway | 8 |
| Plasminogen activating cascade | 5 |
| Insulin/IGF pathway-protein kinase B signaling cascade | 6 |
| p53 pathway feedback loops 2 | 5 |
| VEGF signaling pathway | 5 |
| Angiogenesis | 7 |
| Androgen/estrogene/progesterone biosynthesis | 3 |
| Insulin/IGF pathway-mitogen-activated protein kinase kinase/MAP kinase cascade | 4 |
| Alzheimer disease-presenilin pathway | 6 |
| EGF receptor signaling pathway | 6 |
| PI3 kinase pathway | 4 |
ᵃOut of the 122 candidate genes, 65 do not participate in any of the PANTHER pathways.
Figure 3. MAPK cascade and pathway components in mammalian cells. Adapted from Cabodi and Zhang and Liu (2002).