Liuyang Wang1, Thomas J Balmat2, Alejandro L Antonia1, Florica J Constantine3, Ricardo Henao3, Thomas W Burke3, Andy Ingham2, Micah T McClain3,4,5, Ephraim L Tsalik1,3,4,5, Emily R Ko3,6, Geoffrey S Ginsburg3, Mark R DeLong2, Xiling Shen7, Christopher W Woods3,4,5, Elizabeth R Hauser8,9, Dennis C Ko10,11.
Abstract
BACKGROUND: While genome-wide association studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding the mechanisms that lead from genetic variation to pathophysiology remains an important challenge. Methods are needed to systematically bridge this crucial gap to facilitate experimental testing of hypotheses and translation to clinical utility.
Keywords: Colocalization; Cross-phenotype association; Gout; Hi-HOST; Idiopathic pulmonary fibrosis; LD-score; Macular telangiectasia; PheWAS; Pleiotropy; rs12610495; rs2869462; rs505922
Year: 2021 PMID: 34001247 PMCID: PMC8127495 DOI: 10.1186/s13073-021-00904-z
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 15.266
A summary of GWAS data in iCPAGdb
| Dataset | Type | Traits/diseases # | SNPs | Trait-SNP associations # | Website |
|---|---|---|---|---|---|
| NHGRI catalog | Clinical GWAS | 3793 | 63,933 | 85,639 | |
| H2P2 | Molecular/cellular GWAS | 79 (44 flow cytometric phenotypes + 35 cytokines) | 17 | 3489 | |
| Blood metabolites | Molecular GWAS | 491 blood (453 metabolites + 38 xenobiotics) | 1441 | 2024 | |
| Urine metabolites | Molecular GWAS | 55 urine | 149 | 171 | |
| Sum | | 4418 | 65,540 | 91,323 | |
GWAS summary statistics were clumped to include only a lead SNP for each trait locus
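The clumping step described in this footnote can be illustrated as a greedy pass over SNPs sorted by p value. This is a minimal sketch, not the iCPAGdb implementation: the function name `clump_lead_snps`, the `r2` lookup, the toy SNP names, and both thresholds are hypothetical, and the physical-distance window that clumping tools normally apply is omitted.

```python
from typing import Callable, Dict, List


def clump_lead_snps(
    pvalues: Dict[str, float],
    r2: Callable[[str, str], float],
    p_threshold: float,
    r2_threshold: float,
) -> List[str]:
    """Greedy clumping: keep a SNP as a lead only if it is not in LD
    (r2 >= r2_threshold) with any more significant lead already kept."""
    # Most significant SNPs first; drop anything above the p value threshold.
    ranked = sorted((p, rsid) for rsid, p in pvalues.items() if p <= p_threshold)
    leads: List[str] = []
    for _, rsid in ranked:
        if all(r2(rsid, lead) < r2_threshold for lead in leads):
            leads.append(rsid)
    return leads


# Hypothetical toy data: snpB is in strong LD with snpA; snpC is independent;
# snpD does not pass the p value threshold.
toy_p = {"snpA": 1e-10, "snpB": 1e-9, "snpC": 1e-12, "snpD": 0.2}
toy_ld = {frozenset(("snpA", "snpB")): 0.9}


def toy_r2(a: str, b: str) -> float:
    """Look up pairwise LD in the toy table; unlisted pairs count as unlinked."""
    return toy_ld.get(frozenset((a, b)), 0.0)


leads = clump_lead_snps(toy_p, toy_r2, p_threshold=5e-8, r2_threshold=0.4)
print(leads)
```

Each retained lead then stands in for its whole locus, so a trait contributes one SNP per signal to the downstream overlap counts.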
Fig. 1 An improved method for finding shared genetic architecture of human traits. a The overall framework of the iCPAGdb pipeline. GWAS summary statistics (from published GWAS datasets or from user-uploaded GWAS) undergo LD clumping to obtain a lead variant for each signal below a specified p value threshold. These SNPs are queried against an LD proxy database generated from 1000 Genomes African, Asian, or European populations to identify cross-phenotype associations through direct overlap or LD proxy at R2 > 0.4. Significance of overlap for each trait pair was calculated using Fisher’s exact test. Outputs can be visualized/downloaded from the iCPAGdb web browser. b Comparison of the number of shared SNPs for each NHGRI-EBI GWAS catalog trait pair identified through direct overlap vs. both direct and indirect (LD-proxy) overlap. c iCPAGdb detected more significant cross-phenotype associations than CPAG1 at FDR < 0.1. Expansion of the NHGRI-EBI GWAS catalog and improvements in capturing overlap by LD proxy in iCPAGdb fueled a large increase in detected cross-phenotype associations across human traits. Comparisons between CPAG1 and iCPAGdb on the same 2013 dataset are in Additional file 5: Figure S3. d Circle plot of cross-phenotype associations detected by iCPAGdb in the NHGRI-EBI GWAS catalog. After excluding compound phenotypes (phenotypes described by the NHGRI-EBI GWAS catalog as > 1 comma-separated phenotype in their ontology), a total of 1709 traits involved in a total of 53,314 cross-phenotype associations were left. These were categorized into 17 EFO parental groups. Inner ribbons link phenotypes connected by cross-phenotype associations, with the width of each ribbon corresponding to the number of cross-phenotype associations. The axis outside the circle represents the cumulative number of associations for each group vs. all other groups.
e Comparison of genetic correlation from LD score regression (LDSC) and the Chao-Sorensen similarity index implemented in iCPAG demonstrates significant correlation. The genetic correlations r for 24 diseases/traits were obtained from [23]. Since Chao-Sorensen values are bounded from 0 to 1 and r ranges from −1 to 1, we used the absolute value of r here. A colored * indicates a significant trait pair for LDSC, iCPAGdb, or both at a false discovery rate of 0.1. f A model demonstrating how SNPs regulate uric acid levels to impact the development of kidney stones and gout. g Riverplot of gout cross-phenotype associations generated from iCPAGdb output shows mapped genes associated with gout by GWAS (left) connected with NHGRI-EBI GWAS phenotypes grouped into EFO categories (right; colors are different categories). Cross-phenotype associations include causal connections (such as uric acid levels), comorbid outcomes (such as kidney stones), and regulators of disease (such as alpha-1-antitrypsin levels)
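The overlap test in panel a (shared SNPs through direct overlap or LD proxy, scored with Fisher’s exact test) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the iCPAGdb code: the function names, the `ld_proxies` mapping, the toy SNP identifiers other than rs715, and the choice of a one-sided (enrichment) test are all assumptions.

```python
from math import comb
from typing import Dict, Iterable, Set


def count_shared(
    snps_a: Iterable[str], snps_b: Iterable[str], ld_proxies: Dict[str, Set[str]]
) -> int:
    """Count lead SNPs of trait A that match trait B directly or via an LD proxy.

    ld_proxies maps a SNP to the set of SNPs already known to exceed the
    chosen LD cutoff with it (hypothetical precomputed lookup).
    """
    set_b = set(snps_b)
    shared = 0
    for snp in snps_a:
        candidates = {snp} | ld_proxies.get(snp, set())
        if candidates & set_b:
            shared += 1
    return shared


def fisher_overlap_pvalue(
    shared: int, n_trait1: int, n_trait2: int, n_universe: int
) -> float:
    """One-sided Fisher's exact test: P(overlap >= shared) under the
    hypergeometric null that trait 2's SNPs are drawn at random."""
    max_shared = min(n_trait1, n_trait2)
    if not (0 <= shared <= max_shared and max(n_trait1, n_trait2) <= n_universe):
        raise ValueError("inconsistent counts")
    tail = sum(
        comb(n_trait1, x) * comb(n_universe - n_trait1, n_trait2 - x)
        for x in range(shared, max_shared + 1)
    )
    return tail / comb(n_universe, n_trait2)


# Hypothetical example: one direct hit (rs715) and one proxy hit (snp2 ~ snp9).
trait_a = ["rs715", "snp2", "snp3"]
trait_b = ["rs715", "snp9", "snp10", "snp11"]
proxies = {"snp2": {"snp9"}}

k = count_shared(trait_a, trait_b, proxies)
p = fisher_overlap_pvalue(k, len(trait_a), len(trait_b), n_universe=1000)
print(k, p)
```

The universe size (here 1000) strongly influences the p value, so it should reflect the number of independent loci that could have been sampled; the resulting p values would still need multiple-testing correction (the captions report FDR thresholds).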
Fig. 2 iCPAGdb integrates GWAS of different scales to reveal a biological connection between MacTel type 2 and serine. a Multi-dataset network of cross-phenotype associations detected by iCPAGdb. Phenotypes that demonstrated significant overlap (FDR ≤ 0.1) are color-coded in the indicated colors. b Riverplot of macular telangiectasia type 2 (MacTel type 2) cross-phenotype associations generated from iCPAGdb shows mapped genes associated with MacTel type 2 (left) connected with NHGRI-EBI GWAS phenotypes grouped into EFO categories (right; colors are different categories). SNPs in CPS1 and PHGDH are associated with MacTel type 2 and are also associated with serine levels, which are believed to play a causal role in the disease. Other connections may represent causal connections, comorbid outcomes, and regulators of disease. c Cross-phenotype associations connecting MacTel type 2 and serine. One locus demonstrated direct SNP overlap (rs715). A second locus demonstrated indirect overlap based on 4 SNPs in LD, as visualized in the heatmap color-coded by LD. d A model for how SNPs regulate serine levels to impact pathogenesis of MacTel type 2 based on iCPAGdb and prior work described in the text
Fig. 3 Cross-phenotype association analysis reveals that the same genetic locus impacts both Chlamydia-induced CXCL10 levels and CXCL9 (MIG) levels in serum. a Regional Miami colocalization plot demonstrates a genetic locus that impacts both CXCL10 levels in lymphoblastoid cell lines following Chlamydia trachomatis infection and CXCL9 (MIG) levels in serum. b Comparison of -log10(p value) for GWAS of CXCL10 following C. trachomatis infection and levels of CXCL9 (MIG) in serum. The lead SNP in the region for each phenotype is marked. c Scatter plot demonstrates a strong positive correlation between the effect coefficients of SNPs on cellular CXCL10 after C. trachomatis infection and on blood CXCL9 levels. Each dot represents a SNP with p value < 0.01 for both phenotypes. A total of 413 SNPs from a 4-Mb window surrounding the lead SNP rs2869462 were selected. The blue vertical or red horizontal bar shows the standard error of the beta value for each SNP
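The selection in panel c (SNPs with p value < 0.01 for both phenotypes, then a comparison of effect coefficients) can be sketched as below. The caption does not say which correlation coefficient was used, so Pearson's r is an assumption here, as are the record layout (`beta1`, `p1`, `beta2`, `p2`) and the toy values.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def effect_correlation(
    snps: List[Dict[str, float]], p_cutoff: float = 0.01
) -> Tuple[int, float]:
    """Keep SNPs passing the cutoff in BOTH GWAS, then correlate their betas."""
    kept = [s for s in snps if s["p1"] < p_cutoff and s["p2"] < p_cutoff]
    r = pearson_r([s["beta1"] for s in kept], [s["beta2"] for s in kept])
    return len(kept), r


# Hypothetical summary statistics for a handful of SNPs in one window.
toy = [
    {"beta1": 0.10, "p1": 1e-4, "beta2": 0.21, "p2": 1e-5},
    {"beta1": 0.30, "p1": 1e-6, "beta2": 0.58, "p2": 1e-8},
    {"beta1": -0.20, "p1": 5e-3, "beta2": -0.41, "p2": 2e-3},
    {"beta1": 0.50, "p1": 0.40, "beta2": -0.90, "p2": 1e-9},  # fails p1 cutoff
]

n_kept, r = effect_correlation(toy)
print(n_kept, round(r, 3))
```

Nearby SNPs are in LD with one another, so a p value attached to such a correlation would overstate the number of independent observations; the coefficient is better read as a descriptive summary of effect concordance.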
COLOC analysis output
| Trait1 | Trait2 | Locus | SNP # | PP3 | PP4 | PP3 + PP4 | PP4/PP3 | Lead causal SNP |
|---|---|---|---|---|---|---|---|---|
| CXCL10 level after C. trachomatis infection | Blood CXCL9 levels | CXCL10 | 1533 | 0.101 | 0.899 | 1.00 | 8.91 | rs2869462 |
| COVID-19 | Plasma CD209 antigen level | ABO | 56 | 0.0159 | 0.984 | 1.00 | 61.72 | rs505922 |
| COVID-19 | Idiopathic pulmonary fibrosis | DPP9 | 1233 | 0.00216 | 0.994 | 0.996 | 459.63 | rs12610495 |
PP3 is the posterior probability for the model where the two traits have independent causal variants. PP4 is the posterior probability for the model where the two traits share a single causal variant
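The two derived columns of the table (PP3 + PP4 and PP4/PP3) follow directly from these definitions. The short sketch below recomputes them from the rounded posteriors printed above; the helper name `coloc_summary` is hypothetical. Because the inputs are rounded, the recomputed ratios differ slightly from the printed PP4/PP3 column (for example, about 8.90 rather than 8.91 for the CXCL10 locus), presumably because the published values were derived from unrounded posteriors (an assumption).

```python
from typing import Tuple


def coloc_summary(pp3: float, pp4: float) -> Tuple[float, float]:
    """Return (PP3 + PP4, PP4 / PP3).

    PP3 + PP4 near 1 means both traits have a signal at the locus;
    a large PP4 / PP3 favours one shared causal variant over two distinct ones.
    """
    if pp3 <= 0:
        raise ValueError("PP3 must be positive to form the ratio")
    return pp3 + pp4, pp4 / pp3


# Rounded posteriors copied from the table: (locus, PP3, PP4).
rows = [
    ("CXCL10", 0.101, 0.899),
    ("ABO", 0.0159, 0.984),
    ("DPP9", 0.00216, 0.994),
]

for locus, pp3, pp4 in rows:
    total, ratio = coloc_summary(pp3, pp4)
    print(f"{locus}: PP3+PP4 = {total:.3f}, PP4/PP3 = {ratio:.2f}")
```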
Fig. 4 Cross-phenotype association of ABO reveals a possible role for CD209 in severe COVID-19. a A network of genetic associations involving severe COVID-19. Each node represents either a disease/trait (filled circles) or a gene (dark blue diamond). The ABO locus was associated with multiple other diseases and levels of specific proteins, while DPP9 connects COVID-19 only with IPF and interstitial lung disease (idiopathic interstitial pneumonia). b Regional Miami colocalization plot demonstrates the ABO locus impacts both CD209 protein levels and risk of severe COVID-19. c A significant positive correlation for effect size of SNPs in the ABO locus on CD209 protein levels and risk of severe COVID-19. d Model of how ABO may affect CD209 and severe COVID-19
Fig. 5 Cross-phenotype analysis and COVID-19 patient transcriptomics reveal a role for DPP9 in severe COVID-19. a Lung eQTL data from GTEx show that the rs12610495 “G” allele is associated with reduced expression of DPP9. b Regional Miami colocalization plot demonstrates the DPP9 locus impacts both idiopathic pulmonary fibrosis and risk of severe COVID-19. c A significant positive correlation for effect size of SNPs in the DPP9 locus on idiopathic pulmonary fibrosis and risk of severe COVID-19. d Model of how DPP9 may affect idiopathic pulmonary fibrosis and risk of severe COVID-19. e DPP9 expression in peripheral blood is significantly higher in COVID-19 patients (n = 77 samples) compared to healthy (n = 19) and bacteria-infected patients (n = 23). The p values were calculated using the Wilcoxon rank-sum test. f COVID-19 patients demonstrate significantly higher DPP9 expression compared to healthy controls during early (days 1–10; n = 19 samples), middle (days 11–20; n = 36), and late (21+ days; n = 22) stages of SARS-CoV-2 infection. The p values were calculated using the Wilcoxon rank-sum test. g DPP9 demonstrates increased expression during recovery from COVID-19. A total of 11 patients were measured sequentially at enrollment (day 0), day 7, and day 14. The colored dashed line connects measurements from the same patient across time points. The p value was calculated using the Friedman test. h Decreased symptom severity scores of COVID-19 patients over time. The eleven subjects in g were assessed for symptom severity at days 0, 7, and 14. The colored dashed line connects measurements from the same patient across time points. The p value was calculated using the Friedman test
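The repeated-measures design in panels g and h (the same 11 patients at days 0, 7, and 14) is what the Friedman test is built for: values are ranked within each patient, and the rank sums per time point are compared. The sketch below is a stdlib-only illustration with hypothetical data, not the authors' analysis. It uses the chi-square approximation without a tie correction; with three time points the statistic has 2 degrees of freedom, for which the chi-square survival function is exactly exp(-Q/2). In practice a library routine such as `scipy.stats.friedmanchisquare` would normally be used.

```python
from math import exp
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this 1-based rank
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_three_timepoints(rows: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Friedman statistic Q and approximate p value for k = 3 repeated measures.

    rows[i] holds one subject's values at the three time points.
    No tie correction is applied (simplifying assumption).
    """
    n, k = len(rows), 3
    if n == 0 or any(len(row) != k for row in rows):
        raise ValueError("expected a non-empty list of 3-value rows")
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    p = exp(-q / 2.0)  # chi-square survival function with k - 1 = 2 df
    return q, p


# Hypothetical expression values for 11 patients at day 0, day 7, day 14;
# every patient increases over time, the most extreme pattern possible.
toy_rows = [(5.0 + 0.1 * i, 6.0 + 0.1 * i, 7.0 + 0.1 * i) for i in range(11)]

q, p = friedman_three_timepoints(toy_rows)
print(round(q, 3), p)
```

Because only within-patient ranks enter the statistic, differences in baseline level between patients do not affect the result.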