Michael J McGeachie, George L Clemmer, Jessica Lasky-Su, Amber Dahlin, Benjamin A Raby, Scott T Weiss.
Abstract
We show here that combining two existing genome-wide association studies (GWAS) yields additional biologically relevant information beyond that obtained from either GWAS separately. We propose Joint GWAS Analysis, a method that compares a pair of GWAS for similarity among the top SNP associations, top genes identified, gene functional clusters, and top biological pathways. We show that Joint GWAS Analysis identifies additional enriched biological pathways that would be missed by traditional Single-GWAS analysis. Furthermore, we examine the similarities of six complex genetic disorders at the SNP level, gene level, gene-cluster level, and pathway level. Based on the results of Joint GWAS Analysis, we make concrete hypotheses regarding novel pathway associations for several of the complex disorders considered. Together, these results demonstrate that common complex disorders share substantially more genomic architecture than has been previously realized and that the meta-analysis of GWAS need not be limited to GWAS of the same phenotype to be informative.
Keywords: GWAS; Meta-analysis; Pathway enrichment; Pleiotropy; Systems genetics
Year: 2014 PMID: 25838990 PMCID: PMC4378545 DOI: 10.1016/j.gdata.2014.04.004
Source DB: PubMed Journal: Genom Data ISSN: 2213-5960
Fig. 1. Schematic of Joint GWAS Analysis. In Joint GWAS Analysis, two GWAS of different diseases are compared for enrichment of top SNP hits. Common SNPs occurring prior to the point of maximum enrichment become the “Joint GWAS SNPs.” These SNPs are then mapped to genes to make the Joint GWAS gene list. From these genes, enriched pathways are computed.
Comparison of Joint GWAS gene list pathway coverage vs. Target GWAS gene list pathway coverage. For each WTCCC disease, we show the number of NHGRI pathway clusters whose coverage by genes in enriched pathways is significantly greater for the Joint GWAS gene list than for the Target GWAS gene list. A zero does not necessarily mean that no pathways were identified, only that no pathways were identified with greater coverage than that obtained by the Target gene list. Results are dependent upon DAVID parameters (significance thresholds, pathway providers included in the aggregation).
| Target disease | Cross disease (joint GWAS pathway coverage in excess of target GWAS pathway coverage) | ||||||
|---|---|---|---|---|---|---|---|
| Disease | NHGRI pathway clusters (n) | BD | CAD | CD | RA | T1D | T2D |
| BD | 6 | – | 1 | 0 | 1 | 0 | 0 |
| CAD | 12 | 0 | – | 1 | 0 | 1 | 0 |
| CD | 9 | 1 | 0 | – | 2 | 0 | 2 |
| RA | 5 | 0 | 0 | 1 | – | 0 | 1 |
| T1D | 3 | 0 | 0 | 1 | 1 | – | 1 |
| T2D | 9 | 2 | 1 | 1 | 0 | 2 | 0 |
Novel GWAS pathways identified for each WTCCC disease, with PubMed citations returned for the conjunction of pathway and disease search terms. Highlighted cells indicate pathways for which we hypothesize an association with the disease; other cells are included for completeness, although no hypothesis is made. Green highlighting indicates pathways with evidence in the literature for an association with the WTCCC disease, pink indicates pathways with no apparent evidence of a known association, and orange indicates pathways with indeterminate evidence for an association. Pathway names are summaries written by hand by the authors; pathway names in grouped rows are synonyms used for PubMed searches. Search terms in quotes were submitted to PubMed as quoted phrases and were required to appear in exactly that order in the abstracts or titles of the research articles in question. Search terms grouped horizontally are synonymous terms used to get a broader picture of the relationship of the pathway to the six diseases.
Fig. 2. Enrichment of Common SNPs for each Joint GWAS Analysis at different values of M. M ranges from zero to approximately 106k, which represents all SNPs in the GWAS after filtering down to tag SNPs at linkage disequilibrium < 0.3. All fifteen pairs of diseases are represented. Enrichment of 20 null models is shown in gray, computed using a random split of the controls and assigned an arbitrary case/control phenotype. p-Values shown are computed using a hypergeometric distribution test. For each disease pair, the maximum enrichment is highlighted with a circle.
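The enrichment scan described in this caption can be illustrated with a minimal sketch. It assumes that each GWAS is reduced to a list of SNP identifiers ranked by association strength, that the same SNP set appears in both lists, and that enrichment at a cutoff M is the upper-tail hypergeometric probability of the observed overlap between the two top-M lists. The function names `overlap_pvalue` and `best_cutoff`, the toy SNP identifiers, and the candidate cutoffs are hypothetical and are not taken from the paper.

```python
from math import comb


def overlap_pvalue(n_total, m, k):
    """Upper-tail hypergeometric probability P(overlap >= k).

    Assumes two top-m lists drawn independently at random from the
    same n_total SNPs; k is the observed number of shared SNPs.
    """
    denom = comb(n_total, m)
    tail = sum(comb(m, i) * comb(n_total - m, m - i) for i in range(k, m + 1))
    return tail / denom


def best_cutoff(rank_a, rank_b, cutoffs):
    """Return (m, common_snps, p) for the cutoff with the smallest p-value.

    rank_a and rank_b are SNP identifiers ordered from strongest to
    weakest association; both must contain the same SNPs.
    """
    n_total = len(rank_a)
    best = None
    for m in cutoffs:
        common = set(rank_a[:m]) & set(rank_b[:m])
        p = overlap_pvalue(n_total, m, len(common))
        if best is None or p < best[2]:
            best = (m, common, p)
    return best


# Toy rankings (hypothetical): ten SNPs, the top three shared by both studies.
gwas_a = ["s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9"]
gwas_b = ["s1", "s0", "s2", "s9", "s8", "s7", "s6", "s5", "s4", "s3"]

m, joint_snps, p = best_cutoff(gwas_a, gwas_b, cutoffs=[2, 3, 5])
print(m, sorted(joint_snps), p)
```

In this toy case the shared SNPs found at the chosen cutoff play the role of the "Joint GWAS SNPs". The exact big-integer arithmetic keeps the sketch dependency-free; at the scale in the caption (about 106k tag SNPs) a library implementation of the hypergeometric survival function would likely be more practical.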
General enrichment characteristics for each Joint GWAS Analysis. Enrichment levels are chosen by peak significance of common SNP enrichment (see Fig. 2). M is the number of SNPs that maximizes that enrichment. Joint GWAS SNPs refers to the number of common SNPs in the two GWAS occurring prior to the peak significance point. Joint GWAS genes are computed from Joint GWAS SNPs by using HG18. Joint GWAS Pathways are computed from genes by using the DAVID pathway enrichment tool, including all pathways with enrichment scores better than 0.05.
| | BD vs CAD | BD vs CD | BD vs RA | BD vs T1D | BD vs T2D | CAD vs CD | CAD vs RA | CAD vs T1D | CAD vs T2D | CD vs RA | CD vs T1D | CD vs T2D | RA vs T1D | RA vs T2D | T1D vs T2D |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Shared tag SNPs | 105,881 | 105,835 | 106,622 | 105,933 | 105,889 | 105,903 | 106,722 | 106,015 | 105,957 | 106,772 | 105,974 | 105,913 | 106,773 | 106,698 | 106,005 |
| M | 13,439 | 15,041 | 15,348 | 14,182 | 19,823 | 25,246 | 14,233 | 16,175 | 16,638 | 17,121 | 14,721 | 22,039 | 16,093 | 16,716 | 18,518 |
| Joint GWAS SNPs, Nsnp | 2,791 | 3,315 | 3,423 | 3,185 | 5,186 | 7,711 | 3,160 | 3,915 | 4,121 | 3,963 | 3,278 | 6,113 | 3,926 | 3,950 | 4,800 |
| Joint GWAS genes, Ng | 690 | 755 | 669 | 737 | 684 | 591 | 755 | 1,133 | 716 | 633 | 762 | 678 | 693 | 694 | 706 |
| Joint GWAS pathways | 450 | 556 | 520 | 507 | 395 | 288 | 524 | 626 | 489 | 547 | 562 | 472 | 516 | 507 | 432 |
Comparison of Joint GWAS SNP list vs. Target GWAS SNP list. For each of the six WTCCC diseases, this table shows the number of SNPs identified by all published GWAS of that disease and indexed in the NHGRI catalog. For each WTCCC disease, we compare the number of NHGRI SNPs identified in the Joint GWAS SNP list (leading the slash) to the number identified in the Target GWAS SNP list (trailing the slash). SNPs within linkage disequilibrium of r2 ≥ 0.3 are considered representative of the SNP in question from the NHGRI list. In parentheses, we show how many more NHGRI SNPs were identified by the Joint GWAS SNP list than by the Target GWAS SNP list, as a percent of the total number of NHGRI SNPs. Negative numbers indicate that more NHGRI SNPs were identified by single, Top N Target GWAS than by Joint GWAS.
| Target disease | Cross disease (joint GWAS SNP list/target GWAS SNP list, (% gain)) | ||||||
|---|---|---|---|---|---|---|---|
| Disease | NHGRI SNPs | BD | CAD | CD | RA | T1D | T2D |
| BD | 121 | 0 | 20/40 (−16.5%) | 12/42 (−24.8%) | 22/43 (−17.4%) | 16/42 (−21.5%) | 14/47 (−27.3%) |
| CAD | 135 | 12/29 (−12.6%) | 0 | 18/38 (−14.8%) | 15/29 (−10.4%) | 12/33 (−15.6%) | 10/33 (−17.0%) |
| CD | 151 | 19/74 (−36.4%) | 17/82 (−43.0%) | 0 | 16/77 (−40.4%) | 17/74 (−37.7%) | 16/81 (−43.0%) |
| RA | 78 | 3/21 (−23.1%) | 2/19 (−21.8%) | 10/21 (−14.1%) | 0 | 13/21 (−10.3%) | 6/21 (−19.2%) |
| T1D | 78 | 2/33 (−39.7%) | 7/35 (−35.9%) | 14/34 (−25.6%) | 11/35 (−30.8%) | 0 | 9/37 (−35.9%) |
| T2D | 131 | 11/43 (−24.4%) | 19/39 (−15.3%) | 11/48 (−28.2%) | 4/39 (−26.7%) | 26/43 (−13.0%) | 0 |
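The percentage in parentheses in the table above follows from the caption's definition: the difference between the Joint and Target counts, divided by the total number of NHGRI SNPs for the target disease. A minimal sketch of that arithmetic is given here; the function name `percent_gain` is hypothetical, and the inputs are read directly from the table cells.

```python
def percent_gain(joint_hits, target_hits, nhgri_total):
    """Gain of the Joint list over the Target list, as a percent of the NHGRI total.

    Negative values mean the single Target GWAS recovered more NHGRI entries.
    """
    return 100.0 * (joint_hits - target_hits) / nhgri_total


# BD as target, CAD as cross disease: cell "20/40" with 121 NHGRI SNPs for BD.
print(round(percent_gain(20, 40, 121), 1))
```

The same calculation applies to any other cell: take the two counts around the slash and the NHGRI total from that row.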
Comparison of Joint GWAS gene list vs. Target GWAS gene list. For each of the six WTCCC diseases, this table shows the number of genes identified by all published GWAS of that disease and indexed in the NHGRI catalog. For each WTCCC disease, we compare the number of NHGRI genes identified by the Joint GWAS gene list (leading the slash) to the number identified by the Target GWAS gene list (trailing the slash). In parentheses, we show how many more NHGRI genes were identified by the Joint GWAS gene list than by the Target GWAS gene list, as a percent of the total number of NHGRI genes. Negative numbers indicate that more NHGRI genes were identified by single, Target Disease GWAS than by Joint GWAS.
| Target disease | Cross disease (joint GWAS gene list/target GWAS gene list, (% gain)) | ||||||
|---|---|---|---|---|---|---|---|
| Disease | NHGRI genes (n) | BD | CAD | CD | RA | T1D | T2D |
| BD | 130 | 0 | 8/6 (1.5%) | 8/6 (1.5%) | 4/6 (−1.5%) | 7/6 (0.8%) | 4/6 (−1.5%) |
| CAD | 92 | 6/7 (−1.1%) | 0 | 5/12 (−7.6%) | 7/12 (−5.4%) | 13/13 (0.0%) | 6/12 (−6.5%) |
| CD | 203 | 9/7 (1.0%) | 5/5 (0.0%) | 0 | 10/17 (−3.4%) | 13/20 (−3.4%) | 11/18 (−3.4%) |
| RA | 66 | 5/6 (−1.5%) | 8/5 (4.5%) | 6/7 (−1.5%) | 0 | 6/11 (−7.6%) | 6/11 (−7.6%) |
| T1D | 61 | 4/3 (1.6%) | 7/4 (4.9%) | 7/7 (0.0%) | 5/7 (−3.3%) | 0 | 2/12 (−16.4%) |
| T2D | 105 | 13/14 (−1.0%) | 17/12 (4.8%) | 12/14 (−1.9%) | 14/11 (2.9%) | 15/17 (−1.9%) | 0 |
Comparison of Joint GWAS gene list vs. Target GWAS gene list, considering functional overlap of NHGRI genes. For each of the six WTCCC diseases, this table shows the number of genes identified by all published GWAS of that disease and indexed in the NHGRI catalog. For each WTCCC disease, we compare the number of NHGRI genes mapped to a functional category that includes a gene from the Joint GWAS gene list (leading the slash) to the number mapped to a functional category that includes a gene from the Target GWAS gene list (trailing the slash). This shows the difference between the functional gene clusters identified for each pair of diseases using the Joint GWAS method and those identified for each Target Disease considered singly. In parentheses, we show how many more NHGRI genes were identified by the functional categories of Joint GWAS genes than by single, Target GWAS genes, as a percent of the total number of NHGRI genes. Results are dependent upon DAVID parameters (significance thresholds, pathway providers included in the aggregation). (*) indicates Joint GWAS gene lists that resulted in significantly lower false-positive rates than Target GWAS gene lists; significance was assessed by Chi-square test (or Fisher's exact test in cases of low sample size).
| Target disease | Cross disease (joint GWAS gene list/target GWAS gene list, (% gain)) | ||||||
|---|---|---|---|---|---|---|---|
| Disease | NHGRI genes (n) | BD | CAD | CD | RA | T1D | T2D |
| BD | 130 | 0 | 77/79 (−1.5%) | 79/83 (−3.1%) | 80/80 (0.0%) | 76/84 (−6.2%) | 69/80 (−8.5%) |
| CAD | 92 | 42/36 (6.5%) | 0 | 45/45 (0.0%) | 47/44 (3.3%) | 49/58 (−9.8%) | 46/43 (3.3%) |
| CD | 203 | 121/122 (−0.5%) | 112/97 (7.4%) | 0 | 110/132 (−10.8%) | 114/136 (−10.8%) | 132/134 (−1.0%) |
| RA | 66 | 41/44 (−4.5%) | 44/48 (−6.1%) | 42/40 (3.0%) | 0 | 39/48 (−13.6%) | 44/48 (−6.1%) |
| T1D | 61 | 35/33 (3.3%) | 35/36 (−1.6%) | 35/36 (−1.6%) | 34/36 (−3.3%) | 0 | 30/35 (−8.2%) |
| T2D | 105 | 67/61 (5.7%) | 66/58 (7.6%) | 64/69 (−4.8%) | 67/69 (−1.9%) | 68/64 (3.8%) | 0 |